International Journal of Corpus Linguistics - Volume 14, Issue 1, 2009
-
Automatic measurement of syntactic complexity in child language acquisition
Author(s): Xiaofei Lu
pp.: 3–28 (26)
We describe a heuristics-based system for automatic measurement of syntactic complexity using the revised Developmental Level (D-Level) scale (Rosenberg & Abbeduto 1987; Covington et al. 2006). The system takes a raw sentence as input and assigns it to an appropriate developmental level on the scale. The system is designed with child language acquisition and psycholinguistic research in mind, and is therefore developed and evaluated using both written data from the Penn Treebank (Marcus et al. 1993) and spoken child language acquisition data from the CHILDES database (MacWhinney 2000). Experimental results show that the model achieves accuracies of 94.0% and 93.2% on unseen test data from the Penn Treebank and the CHILDES database respectively. We illustrate how the system is used in an example application to investigate the correlation between average D-Level score and speaker age.
-
Keyness: Words, parts-of-speech and semantic categories in the character-talk of Shakespeare’s Romeo and Juliet
Author(s): Jonathan Culpeper
pp.: 29–59 (31)
This paper explores keywords, key part-of-speech categories and key semantic categories and their role in text analysis. The first part of the paper addresses a set of issues relating to the definition of keywords and their history, the settings used in deriving keywords, the choice of reference corpora, the different kinds of keyword that emerge in one’s results and the dispersion of keywords in one’s data. It argues, amongst other things, that keywords are the same as style markers, and that three types of keyword can be identified: interpersonal, textual and ideational. The second part of the paper addresses the question of what precisely is to be gained from analysing key part-of-speech or key semantic domains in addition to keywords. It shows that whilst in general they add little to a keyword analysis, which is in any case methodologically more robust, there are some significant specific benefits. Answers to many of the questions posed in this paper are illustrated by a study of character-talk from Shakespeare’s play Romeo and Juliet, and in this way this paper also makes a contribution to the fledgling field of corpus stylistics.
-
Gei constructions in Mandarin Chinese and bei constructions in Cantonese: A corpus-driven contrastive study
Author(s): May L-Y Wong
pp.: 60–80 (21)
This paper examines the use of gei constructions in Mandarin Chinese and bei constructions in Cantonese within three corpora (of spoken and written Chinese and Hong Kong Cantonese). There are seven structural patterns in which gei/V-gei takes two objects. The order of these objects is determined by the principle of end-weight. Another four structural patterns see the co-occurrence of verb phrases with gei/V-gei. About four percent of gei constructions are used to mark a passivised verb. The study also reveals that the fronting of the direct object marked by the preposition ba is a rather formal style. In the contrast between Mandarin gei constructions and Cantonese bei constructions, it was found that (i) the order of indirect object followed by direct object in Mandarin Chinese is reversed in Cantonese; (ii) when compared with Mandarin gei, Cantonese bei is more commonly used as a passive marker and as a verb meaning ‘allow’.
-
Where do we backchannel?: On the use of mm, mhm, uh huh and such like
Author(s): Göran Kjellmer
pp.: 81–112 (32)
The paper investigates a sample of ‘backchannels’, a kind of response item, in the Cobuild Corpus. Its object is to chart the occurrence of backchannels in modern English speech, and especially to find out if they can indicate how much of a language sequence is needed for a listener to understand the intended message. The sequences into which backchannels are inserted and their insertion points are therefore classified, and the fairly numerous sequences where backchannels “interrupt” a linguistic unit are singled out for special study. A general conclusion is that in the cases where there is no explicit information about the part of the message following the inserted backchannel, the message will nevertheless mostly be understood even at the backchannel insertion point. A comparison between male and female speakers shows that women use backchannels more than men and that, unlike men, they prefer unemphatic backchannels.
-
Spoken Corpora Design: Their Constitutive Parameters
Author(s): František Čermák
pp.: 113–123 (11)
From a linguistic point of view, spoken corpora should be primary for research, but that has not been the case so far. Hence, the problem of what should be included in the corpora has hardly ever been considered. Often it would appear that anything spoken is included on an ad hoc basis. The need for, and scarcity of, real prototypical spoken corpora point to the necessity of mapping the field in its entirety and identifying its relevant parameters. In order to do this, the present paper translates the major differences between spoken and written texts into usable parameters. Ultimately this could enable the setting up of a representative spoken corpus with a clear core of real and typical spoken language.