Volume 5, Issue 2
  • ISSN 2215-1478
  • E-ISSN: 2215-1486



This paper introduces a new corpus resource for language learning research, the Trinity Lancaster Corpus (TLC), which contains 4.2 million words of interaction between L1 and L2 speakers of English. The corpus includes spoken production from over 2,000 L2 speakers from different linguistic and cultural backgrounds at different levels of proficiency engaged in two to four tasks. The paper provides a description of the TLC and places it in the context of current learner corpus development and research. The discussion of practical decisions taken in the construction of the TLC also enables a critical reflection on current methodological issues in corpus construction.

Available under the CC BY 4.0 license.

