International Journal of Corpus Linguistics - Volume 27, Issue 1, 2022
-
(The) fact is … /(Die) Tatsache ist … focaliser constructions in English and German are similar but subject to different constraints
Author(s): Marianne Hundt and Rahel Oppliger
pp.: 1–30 (30)
Abstract: N-is/ist constructions are elements in the left periphery of English/German sentences that have developed pragmatic meaning: they can be used as discourse markers with various functions, depending on the nominal element that is used in the construction. We use evidence from parallel and comparable corpora of English and German to investigate variable article use in these focaliser constructions and model factors that may play a role in article omission/retention (such as modification, choice of head noun, degree of syntactic integration of the focaliser). Our evidence shows that article use largely depends on the lexical head in German but is constrained by different factors in English (notably modification). We interpret our results against the backdrop of construction grammar, arguing that article omission plays a different role in the two languages. From a contrastive point of view, formal syntactic separation in English is easier to achieve than in German and thus facilitates use of English N-is constructions as focalisers.
-
Universals in machine translation?
pp.: 31–58 (28)
Abstract: By examining and comparing the linguistic patterns in a self-built corpus of Chinese-English translations produced by WeChat Translate, the latest online machine translation app from the most popular social media platform (WeChat) in China, this study explores such questions as whether or not and to what extent simplification and normalization (hypothesized Translation Universals) exhibit themselves in these translations. The results show that, whereas simplification cannot be substantiated, the tendency of normalization to occur in the WeChat translations can be confirmed. The research finds that these results are caused by the operating mechanism of machine translation (MT) systems. Certain salient words tend to prime WeChat’s MT system to repetitively resort to typical language patterns, which leads to a significant overuse of lexical chunks. It is hoped that the present study can shed new light on the development of MT systems and encourage more corpus-based product-oriented research on MT.
-
The syntax and semantics of coherence relations
Author(s): Ludivine Crible
pp.: 59–92 (34)
Abstract: This corpus-based study investigates the inter-relation between discourse markers (DMs) and other contextual signals that contribute to the interpretation of coherence relations. The objectives are three-fold: (i) to provide a comprehensive and systematic portrait of the syntax and semantics of a set of coherence relations in English; (ii) to draw a distinction between mere tendencies of co-occurrence and strong predictive signals; (iii) to identify factors that account for the variation of these signals, focusing on relation complexity, DM strength and genre preferences. The methodology combines systematic coding (description) and multivariate statistical modelling (prediction). While the effect of genre and relation complexity was found to be null or moderate, the presence of discourse signals systematically varies with the ambiguity of the DM in the relation: signals co-occur more with ambiguous DMs than with more informative ones.
-
The Sociolinguistic Speech Corpus of Chilean Spanish (COSCACH)
Author(s): Scott Sadowsky
pp.: 93–125 (33)
Abstract: This paper presents the Sociolinguistic Speech Corpus of Chilean Spanish (COSCACH) v1.0, a 9.3-million-word corpus containing transcribed, lemmatized and morphologically tagged text, audio recordings and videos from 1,237 L1 speakers of Chilean Spanish, as well as a control sample of 21 non-Chilean L1 Spanish speakers. The COSCACH is the first freely available corpus of spoken Chilean Spanish of substantial size, as well as one of the largest speech corpora of any variety of Spanish. Following a review of other Chilean speech corpora, I describe how the COSCACH was constructed, covering corpus design, speaker recruitment and metadata collection, speech elicitation and recording, transcription, lemmatization and morphological tagging, and corpus compilation. I thereby aim to provide a blueprint for creating modern, large-scale speech corpora suitable for phonetic, sociophonetic and sociolinguistic research, in addition to traditional inquiry into semantics, lexis, grammar, pragmatics and discourse.
-
Review of Rüdiger & Dayter (2020): Corpus Approaches to Social Media
Author(s): Elen Le Foll
pp.: 126–132 (7)
This article reviews Corpus Approaches to Social Media.
-
Review of Čermáková & Malá (2021): Variation in Time and Space. Observing the World through Corpora
Author(s): Arja Nurmi
pp.: 133–138 (6)
This article reviews Variation in Time and Space. Observing the World through Corpora.