Exploring part-of-speech profiles and authorship attribution in Early Modern medical texts
Historical linguists frequently find themselves working with primary texts of uncertain or dubious origin. Sometimes the author of a text is not known at all, or the authorship has been contested on the basis of book-historical evidence. Whatever the reason, uncertainty about authorship can lead to problems if the linguistic characteristics of the text are ascribed to the supposed or conventionally accepted author. This exploratory paper evaluates the usefulness of a method of authorship attribution based on cluster analysis of part-of-speech frequencies. While far from perfect, the method is shown to be a useful addition to the methodological toolkit of the historical corpus linguist because it allows quick diagnostic analysis of similarities between texts.