Volume 1, Issue 2
  • ISSN 2542-9477
  • E-ISSN: 2542-9485



This study investigates the effect that reference corpora of different registers have on the content of keyword lists. The study focuses on two target corpora and the keyword lists generated for each when using three distinct reference corpora. The two target corpora consist of published research by faculty at two PhD-granting programs in applied linguistics in North America. The three reference corpora comprise published research in applied linguistics, newspaper and magazine articles, and fiction texts, respectively. The findings suggest that while common keywords representing each target corpus emerge regardless of the reference corpus used in the analysis, the resulting keyword lists also differ substantially. Primarily, using a reference corpus of the same sub-register as the target corpus better highlights content unique to each target corpus, while using a reference corpus of a different register better uncovers words that reflect the register that the target corpora represent. Implications for conducting keyword analysis are discussed.
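The contrast described in the abstract can be illustrated with a minimal sketch. The abstract does not state which keyness statistic the study used, so the log-likelihood (G2) measure below is an assumption, and the function names (`log_likelihood`, `keywords`) and the tiny token lists are hypothetical illustrations rather than the authors' data or procedure.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) keyness score for one word (assumed statistic).

    a: frequency in the target corpus, b: frequency in the reference corpus,
    c: size of the target corpus, d: size of the reference corpus (in tokens).
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the target corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_tokens, reference_tokens, top_n=5):
    """Return up to top_n positive keywords of the target, ranked by G2."""
    t, r = Counter(target_tokens), Counter(reference_tokens)
    nt, nr = len(target_tokens), len(reference_tokens)
    scored = []
    for word, a in t.items():
        b = r.get(word, 0)
        # Keep only words that are relatively more frequent in the target.
        if a / nt > b / nr:
            scored.append((word, log_likelihood(a, b, nt, nr)))
    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Invented toy corpora (hypothetical, for illustration only).
target = ["study", "corpus", "learner", "study", "corpus", "learner", "the", "the"]
ref_same_register = ["study", "corpus", "study", "corpus", "the", "the", "analysis", "results"]
ref_fiction = ["she", "walked", "the", "the", "door", "he", "said", "night"]

same_register_keywords = [w for w, _ in keywords(target, ref_same_register)]
fiction_keywords = [w for w, _ in keywords(target, ref_fiction)]
print(same_register_keywords)  # ['learner']
print(fiction_keywords)        # ['corpus', 'learner', 'study']
```

In this toy case the same-register reference leaves only the word unique to the target ("learner"), whereas the fiction reference also surfaces words typical of the academic register as a whole ("corpus", "study"), which mirrors the pattern the abstract reports.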






  • Article Type: Research Article
Keyword(s): keyword analysis; reference corpus; register; target corpus