International Journal of Corpus Linguistics - Volume 22, Issue 4, 2017
-
The English Grammar Profile of learner competence
Author(s): Anne O’Keeffe and Geraldine Mark
pp.: 457–489 (33)
English Profile (EP) is an ongoing empirical exploration of learner English initiated by Cambridge University Press and Cambridge English, among others. EP aims to create a set of empirically-based descriptions of language competencies for English. ‘Reference Level Descriptors’ already exist as part of the Common European Framework of Reference (CEFR) but are intuitively derived and not designed for one specific language. The English Grammar Profile (EGP, www.englishprofile.org/english-grammar-profile) is a sub-project of EP which aims to profile learner competence in grammar. This paper details the rationale for the study and the methodology that was developed to investigate the Cambridge Learner Corpus to arrive at over 1,200 grammatical competence statements. Key findings which link to existing corpus-based second language acquisition work are also presented.
-
A distributional semantic approach to the periodization of change in the productivity of constructions
Author(s): Florent Perek and Martin Hilpert
pp.: 490–520 (31)
This paper describes a method to automatically identify stages of language change in diachronic corpus data, combining variability-based neighbour clustering, which offers objective and reproducible criteria for periodization, and distributional semantics as a representation of lexical meaning. This method partitions the history of a grammatical construction according to qualitative stages of productivity corresponding to different semantic sets of lexical items attested in it. Two case studies are presented. The first case study on the hell-construction (“Verb the hell out of NP”) shows that the semantic development of a construction does not always match that of its quantitative aspects, like token or type frequency. The second case study on the way-construction compares the results of the present method with those of collostructional analysis. It is shown that the former measures semantic changes and their chronology with greater precision. In sum, this method offers a promising approach to exploring semantic variation in the lexical fillers of constructions and to modelling constructional change.
-
Association with explanation-conveying constructions predicts verbs’ implicit causality biases
Author(s): Emiel van den Hoven and Evelyn C. Ferstl
pp.: 521–550 (30)
Given a sentence such as Mary fascinated/admired Sue because she did great, the verb fascinated leads people to interpret she as referring to Mary, whereas admired leads people to interpret she as referring to Sue. This phenomenon is known as implicit causality (IC). Recent studies have shown that verbs’ causality biases closely correspond to the verbs’ semantic classes, as classified in VerbNet, a lexicon that groups verbs into classes on the basis of syntactic behavior. The current study further investigates the relationship between causality biases and semantic classes. Using corpus data we show that the collostruction strength between verbs and the syntactic constructions that VerbNet classes are based on can be a good predictor of causality bias. This result suggests that the relation between semantic class and causality bias is not a categorical matter; more typical members of the semantic class show a stronger causality bias than less typical members.
-
Multi-word discourse markers and their corpus-driven identification
Author(s): Kaja Dobrovoljc
pp.: 551–582 (32)
With expanding evidence on the formulaic nature of human communication, there is a growing need to extend discourse marker research to functionally analogue multi-word expressions. In contrast to the common qualitative approaches to discourse marker identification in corpora, this paper presents a corpus-driven semi-automatic approach to identification of multi-word discourse markers (MWDMs) in the reference corpus of spoken Slovene. Using eight statistical measures, we identified 173 structurally fixed discourse-marking MWEs, distinguished by a high number of tokens, a large proportion of grammatical words and semantic heterogeneity. This is a significantly longer list than would have been gained by manual inspection of smaller corpus samples. Although frequency-based methods produced satisfactory results, best precision in MWDM identification was achieved using the t-score association measure, while the overall poor performance of the mutual information suggests its inadequacy for extraction of MWDMs and other MWEs with similar lexical and distributional features.