International Journal of Corpus Linguistics - Volume 27, Issue 2, 2022
-
Verb form error detection in written English of Chinese EFL learners
Author(s): Gong Chen and Maocheng Liang
pp.: 139–165 (27)
Abstract: In the past few decades, researchers have paid increasing attention to automatic error detection in natural languages, but few have focused on developing an error-checking tool for EFL learners in China. Based on the theory of Pattern Grammar, this study formalizes verb patterns through Link Grammar, a formal grammatical system developed by Sleator and Temperley (1991), and reconstructs a Link Grammar verb dictionary to create an automatic checking tool for verb form errors in Chinese learners’ written English. The test results show that, by importing more detailed verb pattern information into the Link Grammar dictionary, the Link Grammar parser can identify verb form errors more accurately and effectively than the original, and that its parsing capability is improved. The article shows that Pattern Grammar and Link Grammar can work together and be applied to the construction of error-checking tools for EFL learners with promising results.
-
The hapax / type ratio
Author(s): Niek Van Wettere
pp.: 166–190 (25)
Abstract: This article addresses one of the lesser-known productivity measures, namely the hapax / type ratio (HTR). Through a case study involving the Dutch semi-copula raken (“attain”), it is shown that the HTR more or less stabilizes from a certain sample size onwards. Moreover, this point of stabilization seems to coincide with an increased permanency of the hapaxes, i.e. the share of hapaxes that convert quickly to non-hapaxes is not as large as was the case at the beginning of the sampling process. Therefore, the stabilization of the HTR might be a good indicator of minimally required sample size in productivity studies, suggesting that the hapaxes are ‘non-incidental’ from this sample size onwards. However, I did not find a clear link between the onset of the stabilization of the HTR and the extent to which the inventory of types accounted for at the top of the frequency distribution is (quasi-)complete.
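As a rough illustration of the measure named in this abstract, the sketch below assumes the HTR is the number of hapaxes (types occurring exactly once) divided by the number of types, and recomputes it on growing prefixes of a sample to see where it levels off. The function names `hapax_type_ratio` and `htr_curve` and the `step` parameter are hypothetical and are not taken from the article, which may define or sample the measure differently.

```python
from collections import Counter


def hapax_type_ratio(tokens):
    """Return (number of hapaxes) / (number of types); 0.0 for an empty sample."""
    counts = Counter(tokens)
    n_types = len(counts)
    n_hapaxes = sum(1 for freq in counts.values() if freq == 1)
    return n_hapaxes / n_types if n_types else 0.0


def htr_curve(tokens, step):
    """Compute the HTR on prefixes of length step, 2*step, ... of the sample.

    Returns a list of (sample_size, htr) pairs, which can be inspected or
    plotted to judge where the ratio stops changing much.
    """
    return [
        (size, hapax_type_ratio(tokens[:size]))
        for size in range(step, len(tokens) + 1, step)
    ]
```

In a productivity study the tokens would typically be the items filling the slot under investigation, drawn in sampling order, so that the curve reflects how the ratio develops as the sample grows.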
-
A multi-dimensional comparison of the effectiveness and efficiency of association measures in collocation extraction
Author(s): Yaochen Deng and Dilin Liu
pp.: 191–219 (29)
Abstract: Because of the ubiquity and importance of collocations in language use/learning, how to effectively and efficiently identify collocations has been a topic of interest. Although some studies have evaluated many of the existing association measures (AMs) used in the automatic identification of collocations, the results so far have been inconsistent and unclear due to various limitations of the existing studies. Hence, this study makes a multi-dimensional evaluation of the effectiveness and efficiency of seven major AMs in the identification of three types of collocations across five genres and seven corpora of different sizes. The results indicate that while a few AMs, such as Log Likelihood Ratio and Cubic Mutual Information (MI3), are consistently more effective and efficient than the other five AMs being examined, no one AM alone may be adequate in the identification of different types of collocations across different genres and corpus sizes. Research implications are also discussed.
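For readers unfamiliar with the two measures singled out in this abstract, the following is a minimal sketch using their commonly cited textbook forms: MI3 as log2(O³ / E) and the log-likelihood ratio as 2 · Σ O · ln(O / E) over the 2×2 contingency table of a word pair. The function names and the parameters `o11` (co-occurrence frequency), `f1` and `f2` (frequencies of the two words), and `n` (corpus size) are assumptions made for illustration; the article's own formulas and settings are not given in the abstract.

```python
import math


def mi3(o11, f1, f2, n):
    """Cubic Mutual Information: log2(O^3 / E), with E = f1 * f2 / n."""
    expected = f1 * f2 / n
    return math.log2(o11 ** 3 / expected)


def log_likelihood(o11, f1, f2, n):
    """Log-likelihood ratio over the 2x2 table of a word pair.

    Cells: both words, first word only, second word only, neither.
    Cells with an observed count of zero contribute nothing to the sum.
    """
    observed = [o11, f1 - o11, f2 - o11, n - f1 - f2 + o11]
    expected = [
        f1 * f2 / n,
        f1 * (n - f2) / n,
        (n - f1) * f2 / n,
        (n - f1) * (n - f2) / n,
    ]
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
```

When the observed co-occurrence count equals the count expected under independence, the log-likelihood score is zero; higher scores indicate a stronger departure from independence.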
-
Degrees of non-standardness
Author(s): Teodora Vuković, Anastasia Escher and Barbara Sonnenhauser
pp.: 220–247 (28)
Abstract: A corpus-based method for assessing a range of dialect-standard variation is presented for identifying samples exhibiting the highest prevalence of dialect features. This method provides insight into areal and inter-speaker variation and allows the extraction of maximally non-standard manifestations of the dialect, which may then be sampled and used for the study of language change and variation. The focus is on a non-standard Torlak variety, which has undergone considerable change under the influence of standard Serbian. The degree of variation is assessed by measuring the frequencies of five distinguishing linguistic features: accent position, dative reflexive si, auxiliary omission in the compound perfect, the post-positive article, and analytic case marking in the indirect object and possessive. Locations subject to the greatest and least influence of the standard are revealed using hierarchical clustering. A positive correlation between the frequencies of occurrence reveals which non-standard feature is the best predictor of the others.
-
Review of Stefanowitsch (2020): Corpus Linguistics: A Guide to the Methodology
Author(s): Kevin F Gerigk
pp.: 248–253 (6)
This article reviews Corpus Linguistics: A Guide to the Methodology.
-
Review of Feng (2020): Form, Meaning and Function in Collocation: A Corpus Study on Commercial Chinese-to-English Translation
Author(s): Mehrdad Vasheghani Farahani
pp.: 254–259 (6)
This article reviews Form, Meaning and Function in Collocation: A Corpus Study on Commercial Chinese-to-English Translation.