Volume 25, Issue 1
  • ISSN 1384-6655
  • E-ISSN: 1569-9811



Lexical dispersion is typically measured across arbitrary corpus parts of equal size. In this study, we apply DA, a new dispersion index designed for unequal-sized corpus parts, to the British National Corpus (BNC) in a series of case studies to show that the dispersion of a word is strongly influenced by the corpus units or parts it is measured across. Our results show that dispersion should be measured and interpreted based on corpus units that are linguistically meaningful for a particular research goal. We conclude with recommendations to help researchers select meaningful corpus units for measuring and interpreting lexical dispersion.
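The dependence of dispersion on the choice of corpus unit can be illustrated with a small sketch. The formula used here (one minus the mean pairwise absolute difference of per-part rates, divided by twice the mean rate) is an assumption about the general form of an index for unequal-sized parts; this page does not state the formula, so it should not be read as the authors' exact definition. The function name `pairwise_dispersion` and the toy counts are hypothetical.

```python
from itertools import combinations


def pairwise_dispersion(counts, sizes):
    """Dispersion of one word across corpus parts of possibly unequal size.

    Assumed form (not confirmed by the source page):
        1 - mean(|r_i - r_j| over all pairs i < j) / (2 * mean(r))
    where r_i = counts[i] / sizes[i] is the word's rate in part i.
    Returns 1.0 for a perfectly even spread and 0.0 when every
    occurrence falls in a single part.
    """
    if len(counts) != len(sizes) or len(counts) < 2:
        raise ValueError("need at least two parts with matching sizes")
    # Normalising by part size is what makes unequal parts comparable.
    rates = [c / s for c, s in zip(counts, sizes)]
    mean_rate = sum(rates) / len(rates)
    if mean_rate == 0:
        raise ValueError("the word does not occur in any part")
    pairs = list(combinations(rates, 2))
    mean_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return 1 - mean_diff / (2 * mean_rate)


# Hypothetical toy data: one word counted in four texts of unequal length.
text_counts = [10, 0, 20, 0]
text_sizes = [1000, 3000, 2000, 2000]

# Unit choice 1: each text is its own corpus part.
by_text = pairwise_dispersion(text_counts, text_sizes)

# Unit choice 2: texts 1-2 and texts 3-4 merged into two larger parts.
group_counts = [10 + 0, 20 + 0]
group_sizes = [1000 + 3000, 2000 + 2000]
by_group = pairwise_dispersion(group_counts, group_sizes)

print(round(by_text, 3), round(by_group, 3))
```

With these toy counts the same word scores about 0.33 when measured across individual texts and about 0.67 when measured across the two merged parts, which is the kind of unit-dependence the abstract describes.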






  • Article Type: Research Article
Keyword(s): corpus design; DA; mode; text; word frequency lists