International Journal of Corpus Linguistics - Volume 30, Issue 3, 2025
-
Examining contextual constraints on the English dative alternation in L2 written production
pp.: 265–295 (31)
Abstract: This study examines how contextual factors influence the English dative alternation in written production by Chinese EFL learners, with native English usage serving as the benchmark for comparison. The dataset contained 2,492 tokens of the dative alternation (e.g. They give us books vs. They give books to us), extracted from a British English corpus and a Chinese learner English corpus, respectively. Seven probabilistic constraints of the constituents were annotated: length, complexity, pronominality, definiteness, animacy, person, and concreteness. Mixed-effects logistic regression analyses revealed substantial similarities in the core probabilistic grammar of the dative alternation between British English and Chinese learner English. However, the impact of pronominality, definiteness, and person differed. These findings suggest that L2 learners are capable of discerning contextual cues within implicit input and incorporating them into their own language usage. Nevertheless, the degree to which these cues can be acquired varies with input features and processing limitations.
-
Using machine learning to automate data annotation in corpus linguistics
Author(s): Lauren Fonteyn, Enrique Manjavacas and Jaleesa De Regt
pp.: 296–315 (20)
Abstract: A wealth of linguistic data has been annotated by corpus linguists, and this extant annotated data can be used to automatically replicate and apply the linguist’s annotation scheme by means of machine learning models. This paper accompanies the release of documented code notebooks, which allow corpus linguists to use manually categorized examples or ‘training data’ as input for a predictive language model. By means of a case study of Early Modern English -ing forms, we describe how the predictive language model MacBERTh can be used to accurately replicate the manual data classification scheme employed in previous corpus linguistic studies. Additionally, we discuss how manual error analysis and post-correction may help improve the model’s output. By openly releasing the data and code used in this paper, we hope to stimulate the use of machine learning models such as MacBERTh in corpus linguistics.
-
Interactive metadiscourse across languages and writer groups
Author(s): Heng Gong, Feng Cao and Lingling Liu
pp.: 316–351 (36)
Abstract: Although English has become a lingua franca for academic publication, a growing number of multilingual scholars prefer to publish in both English and their first languages. This corpus-based study investigates how Chinese scholars in applied linguistics deploy interactive metadiscourse in their published Chinese and L2 English research articles, compared with those of L1 English writers. Both Chinese character(zì)-based and word(cí)-based units were used to segment and quantify the Chinese corpus, yielding two contrasting sets of results in the cross-linguistic comparisons. To ensure a conceptually equivalent comparison, we opted for word-based results, showing that the L1 Chinese corpus evidenced more frequent interactive metadiscoursal features than both the L1 and L2 English corpora. The latter two corpora, by contrast, revealed similar patterns of distribution. The divergences and convergences between Chinese and English corpora indicate linguacultural influences on interactive metadiscourse and reveal the methodological constraints on analysis of similar linguistic features.
-
Lexical Priming theory
Author(s): Alan Partington and Eugenia Diegoli
pp.: 352–375 (24)
Abstract: This paper is an early step in a wider project which, on the behest of the late Prof Michael Hoey, attempts to review the evolution of Lexical Priming (LP) theory since its first appearance in the early 2000s. Hoey’s later unpublished work was characterised by his desire that LP theory be tested on discourse-types beyond newspaper texts and in languages other than English. Here, we make a first attempt to test LP theory on Japanese data from a web corpus. Referring to Hoey’s own examples, to our Japanese data and to English data from the web (enTenTen21) and newspaper (SiBol) corpora, we suggest how evaluation theory might fruitfully and seamlessly be integrated into LP theory, and how textual primings are even more powerful than originally envisaged. We demonstrate how we are primed to produce and process texts into evaluative blocks so that they cohere evaluatively as well as propositionally.
-
Plunged into fuel poverty
Author(s): Leigh Harrington, Maria Fano Gonzalez and Kevin Frank Gerigk
pp.: 376–416 (41)
Abstract: Fuel poverty, a household’s inability to achieve thermal comfort in line with a healthy standard of living at a reasonable cost, became an increasingly prevalent and visible socio-economic issue in the UK during the 2020–21 winter lockdowns. Using FuelPovertyPressUK, a specialised corpus of UK newspaper reporting, this paper is the first treatment of the discursive representation of fuel poverty as a distinct form of socio-economic inequality. We conduct a diachronic corpus-assisted discourse analysis, comparing Pre-COVID-19 Winter (September 2019–March 2020) and During-COVID-19 Winter (September 2020–March 2021) subcorpora. The findings demonstrate that newspaper reporting adequately reflected the increasing heterogeneity of the (new) fuel poor across COVID-19 and ultimately positioned the fuel poor as agentless. The paper highlights the value of corpus methods to linguistics-based and interdisciplinary poverty research and the challenge posed to corpus and discourse studies by analysing the representation of indeterminate and in-flux social groups.
-
Review of Gries (2024): Frequency, Dispersion, Association, and Keyness: Revising and tupleizing corpus-linguistic measures
Author(s): William C. X. Platt
pp.: 417–424 (8)
This article reviews Frequency, Dispersion, Association, and Keyness: Revising and tupleizing corpus-linguistic measures.
-
Review of Landert (2024): Methods in Historical Corpus Pragmatics: Epistemic Stance in Early Modern English
Author(s): Lieselotte Brems
pp.: 425–432 (8)
This article reviews Methods in Historical Corpus Pragmatics: Epistemic Stance in Early Modern English.