Volume 25, Issue 2
  • ISSN 1384-6655
  • E-ISSN: 1569-9811



This study explores how corpus design influences comparisons of lexical bundle use across groups, examining how the number of texts and the average length of texts can affect conclusions about group differences. The study compares the use of lexical bundles by L1-English and L2-English writers, based on an analysis of two sub-corpora of academic articles that are matched for discipline, writer expertise, time of publication, and audience. The two sub-corpora differ, however, in the number of texts and the average length of texts. Three experiments examine the influence of these differences in corpus composition. The results show that differences in the number of words and the number of texts across sub-corpora can strongly affect claimed differences in bundle use across groups. This effect is found even when the texts in the corpora are closely matched for register and topic.
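To make the role of corpus composition concrete, the following is a minimal sketch of a typical bundle-extraction procedure, not the procedure used in the study. It assumes four-word bundles, whitespace tokenization, a normalized frequency cut-off of 40 per million words, and a range cut-off of five texts; the function name `extract_bundles` and all threshold values are illustrative assumptions. The two toy corpora contain the same number of words and the same raw bundle frequency, but differ in the number and length of texts.

```python
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_per_million=40.0, min_texts=5):
    """Return n-word bundles that meet a frequency and a range threshold.

    texts: list of strings, one per text in the (sub-)corpus.
    All thresholds are illustrative assumptions, not values from the study.
    """
    freq = Counter()                # total occurrences of each n-gram
    text_range = defaultdict(int)   # number of texts each n-gram occurs in
    total_words = 0
    for text in texts:
        tokens = text.lower().split()
        total_words += len(tokens)
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(ngrams)
        for gram in set(ngrams):    # count each text once per n-gram
            text_range[gram] += 1
    # Convert the normalized cut-off into a raw count for this corpus size.
    raw_cutoff = min_per_million * total_words / 1_000_000
    return {
        " ".join(gram): count
        for gram, count in freq.items()
        if count >= raw_cutoff and text_range[gram] >= min_texts
    }


# Two toy sub-corpora of 36 words each: six short texts versus two long texts.
corpus_a = ["on the other hand we see"] * 6
corpus_b = [" ".join(["on the other hand we see"] * 3)] * 2

bundles_a = extract_bundles(corpus_a)
bundles_b = extract_bundles(corpus_b)
print("on the other hand" in bundles_a)
print("on the other hand" in bundles_b)
```

In this toy case the sequence occurs six times in both corpora, yet it qualifies as a bundle only in the corpus with more texts, because the range cut-off depends on how many texts are available. With real data, the frequency cut-off interacts with corpus size in a similar way.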




