International Journal of Learner Corpus Research - Current Issue
Volume 12, Issue 1, 2026
-
The influence of L1 Dutch on connective use in L2 German academic writing: A contrastive corpus-based analysis
Author(s): Helena Wedig, Carola Strobl, Jim J. J. Ureel, Tanja Mortelmans and Larissa Weber
pp.: 2–36 (35)
Abstract: The present study provides a comparative corpus-based analysis of summaries written by three groups: first-language (L1) German writers, second-language (L2) German writers with L1 Dutch, and L2 German writers with other L1s. The aim is to determine whether there are differences in connective use between L1 and L2 writers in summary writing and whether there are L1 Dutch-specific differences. The results show that L2 German writers with non-Dutch L1s use fewer connectives than L1 German writers, whereas L2 German writers with L1 Dutch use more connectives, especially expansion and contingency connectives. In addition, L2 German writers prefer certain connectives (e.g., und (and), weil (because)) and L2 German writers with L1 Dutch aber (but). Overall, this study highlights the importance of (contrastively) analysing summary writing as well as considering under-researched language pairs such as German and Dutch.
-
L2 phraseological use during an attrition period: The potential role of peak attainment and L2 exposure
Author(s): Amanda Edmonds and Aarnes Gudmestad
pp.: 37–64 (28)
Abstract: This study explores if and how phraseological use patterns change over a five-year period for 14 learners of second-language (L2) Spanish. This period covers an academic year spent in a target-language environment, followed by a four-year attrition period. In addition to documenting potential change in usage patterns, we examine how peak attainment and continued L2 contact during the attrition period influence phraseological competence. The analysis focuses on one type of word combination, namely noun/adjective pairs, and measures change by looking at the frequency of noun/adjective sequences and the strength of the association between the two words. Results point to stability in phraseological competence, with no significant patterns of attrition being uncovered. These findings are interpreted against the backdrop of the small body of research on L2 lexical and, specifically, phraseological attrition, contributing to what is known about long-term learning trajectories in the lexical domain.
-
The relative complexity of alternation phenomena in spoken English as a Foreign Language
Author(s): Tanguy Dubois and Yixi Chen
pp.: 65–93 (29)
Abstract: While the assumption within theoretical linguistics is that alternation phenomena (e.g., “everyone” vs. “everybody”) lead to redundant complexity because they introduce multiple ways of saying the same thing, this assumption has not yet been tested empirically for EFL learners. To fill this gap, we collected thousands of spoken utterances from low-intermediate to advanced learners of English with either Spanish, Chinese or Italian as their L1 background from the Trinity Lancaster Corpus. Using mixed-effects Poisson regression, we analyzed whether the number of alternation phenomena correlates with the relative complexity of utterances, operationalized as being proportional to the number of disfluencies produced. Results indicate that, even for low-intermediate learners of English, alternation contexts do not induce more disfluencies, contrary to commonly held assumptions in theoretical linguistics and in line with similar research on L1 English speakers.
-
SEEFLEX: The Corpus of Secondary English as a Foreign Language (EFL) Exams
Author(s): Tobias Pauls
pp.: 94–119 (26)
Abstract: This report presents the Corpus of Secondary School English as a Foreign Language (EFL) Exams (SEEFLEX). In Germany, upper secondary school EFL exams feature recurring tasks targeting diverse text types. The SEEFLEX was developed to investigate how students complete these tasks linguistically and whether they meet the curricular requirements. The corpus contains data from 575 transcribed authentic curriculum-based examinations (1,979 texts, ~625,000 words). The metadata include standardized receptive vocabulary assessments, a cognition scale, the participants’ reading habits, social background, and their language experience and proficiency. Extensive XML mark-up was added to investigate the influence of, inter alia, source material, structural text features, and selected language mistakes. An online repository provides full-text access as well as ample additional resources, including an interactive Shiny application to investigate register variation in the corpus.
-
Automatic discourse segmentation of L1 and L2 spoken English transcripts
Author(s): Linsey C. Yang, Wenwei Dong, Nathan Vandeweerd and Jet Hoek
pp.: 120–142 (23)
Abstract: Natural language processing (NLP) tools, primarily trained on L1 written English, have achieved remarkable performance, but are rarely used in L2 learner data. This study leverages a rule-based segmenter to automatically segment spoken English discourse by both L1 speakers and learners, presenting novel preparatory data-cleaning steps that combine a state-of-the-art disfluency detector and additional rules to improve segmentation performance. In three successive segmentation tests on data from the Louvain Corpus of Native English Conversation (LOCNEC; De Cock, 2004) and the Louvain International Database of Spoken English Interlanguage (LINDSEI; Gilquin et al., 2010), we achieve an enhanced segmentation performance that is similar for both the L1 and L2 data (.84). Our approach highlights the effectiveness of leveraging existing NLP tools to process disfluent L2 spoken transcripts, facilitating automatic discourse analysis in Learner Corpus Research (LCR). The code for executing our pipeline is publicly available for future research.
Most Read This Month
-
The Trinity Lancaster Corpus
Author(s): Dana Gablasova, Vaclav Brezina and Tony McEnery