The hapax / type ratio
An indicator of minimally required sample size in productivity studies?
- Source: International Journal of Corpus Linguistics, Volume 27, Issue 2, Jun 2022, p. 166 - 190
- 20 Oct 2019
- 20 Oct 2021
- 09 Mar 2022
Abstract
This article addresses one of the lesser-known productivity measures, namely the hapax / type ratio (HTR). Through a case study involving the Dutch semi-copula raken (“attain”), it is shown that the HTR more or less stabilizes from a certain sample size onwards. Moreover, this point of stabilization seems to coincide with an increased permanency of the hapaxes, i.e. the share of hapaxes that convert quickly to non-hapaxes is not as large as was the case at the beginning of the sampling process. Therefore, the stabilization of the HTR might be a good indicator of minimally required sample size in productivity studies, suggesting that the hapaxes are ‘non-incidental’ from this sample size onwards. However, I did not find a clear link between the onset of the stabilization of the HTR and the extent to which the inventory of types accounted for at the top of the frequency distribution is (quasi-)complete.
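As a rough illustration of the measure discussed in the abstract, the sketch below computes the HTR at increasing sample sizes. It assumes the HTR is simply the number of hapaxes (types occurring exactly once in the sample) divided by the number of types, as the name suggests. The function names `hapax_type_ratio` and `htr_curve` and the toy token list are hypothetical and are not taken from the raken case study, and the sketch does not implement any particular criterion for deciding when the ratio has stabilized.

```python
from collections import Counter


def hapax_type_ratio(tokens):
    """Return hapaxes / types for a sample of tokens (one label per token)."""
    counts = Counter(tokens)
    if not counts:
        raise ValueError("cannot compute the HTR of an empty sample")
    # A hapax is a type that occurs exactly once in the sample.
    hapaxes = sum(1 for freq in counts.values() if freq == 1)
    return hapaxes / len(counts)


def htr_curve(tokens, step):
    """Return (sample size, HTR) pairs for growing prefixes of the sample."""
    return [
        (n, hapax_type_ratio(tokens[:n]))
        for n in range(step, len(tokens) + 1, step)
    ]


# Invented toy sample: each string stands for the type of one attested token.
sample = ["a", "b", "a", "c", "d", "a", "b", "e"]
for size, htr in htr_curve(sample, step=4):
    print(size, round(htr, 3))
```

Plotting such a curve for a real sample (with the tokens in the order in which they were drawn) would be one way to inspect by eye where the ratio flattens out.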