Volume 25, Issue 1
  • ISSN 1384-6655
  • E-ISSN: 1569-9811



Vocabulary load is a predictor of comprehension and a common concern in relation to learner use of concordances; however, vocabulary load figures for whole texts have limited relevance to learner use of concordances. This paper explores the average vocabulary load of the citations (or lines) in a concordance, reflecting how learners use concordances as reading or reference resources. Non-parametric tests are used to compare the vocabulary loads of citations from three authentic written corpora and a corpus of graded readers. The results indicate that citations from authentic corpora have an average vocabulary load of 4,000–5,000 word families, there are reliable differences in vocabulary load between citations from different corpora, and the magnitude of difference between citations from authentic corpora can be equivalent to the magnitude of difference between authentic corpora and graded reader corpora. The paper concludes with a discussion of the results in relation to language learner use of concordances.
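The measure described above can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not state the coverage threshold, the word lists, or the specific non-parametric test, so the 95% threshold, the toy `BANDS` lookup, the `OFF_LIST` sentinel, and the Mann-Whitney U statistic below are all assumptions chosen for illustration. The sketch also maps word forms directly to frequency bands rather than grouping them into word families.

```python
import math
import re
from statistics import mean

# Hypothetical toy lookup: word form -> frequency band (1 = most frequent 1,000).
# A real analysis would use full word-family lists; these entries are made up.
BANDS = {
    "the": 1, "a": 1, "of": 1, "was": 1, "in": 1, "house": 1, "old": 1,
    "garden": 2, "ancient": 3, "dilapidated": 9,
}
OFF_LIST = 99  # assumed sentinel band for words found in no list


def vocabulary_load(citation, coverage=0.95):
    """Lowest band at which the known share of tokens reaches `coverage`."""
    tokens = re.findall(r"[a-z']+", citation.lower())
    if not tokens:
        raise ValueError("citation contains no word tokens")
    bands = sorted(BANDS.get(t, OFF_LIST) for t in tokens)
    needed = math.ceil(coverage * len(bands))  # tokens that must be covered
    return bands[needed - 1]


def average_load(citations, coverage=0.95):
    """Mean vocabulary load across the citations of one concordance."""
    return mean(vocabulary_load(c, coverage) for c in citations)


def mann_whitney_u(xs, ys):
    """U statistic for xs: pairs with x > y count 1, ties count one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in xs for y in ys)


concordance = [
    "The old house was in the garden",
    "The dilapidated house was ancient",
]
print([vocabulary_load(c) for c in concordance])  # [2, 9]
print(average_load(concordance))                  # 5.5
print(mann_whitney_u([9, 5, 4], [2, 2, 4]))       # 8.5
```

Scoring each citation separately, rather than the whole text, is what distinguishes this measure from a whole-text vocabulary load figure: a single rare word weighs heavily in a short line, so per-citation loads can sit well above the load of the text the line came from.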






  • Article Type: Research Article
Keyword(s): authentic texts; citations; concordancing; graded readers; vocabulary load