International Journal of Corpus Linguistics - Volume 12, Issue 1, 2007
-
Through children’s eyes?: Corpus evidence of the features of children’s literature
Author(s): Paul Thompson and Alison Sealey
pp.: 1–23 (23)
This article reports on an analysis of a small corpus of fiction written for children, extracted from the BNC. Quantitative analyses of most frequent words and sequences of words, and of parts-of-speech, were conducted, and compared with their equivalents in two other sub-corpora of the BNC, of adult fiction and of newspaper texts. The main findings point to some characteristics of both the fiction corpora which are very similar, and which contrast markedly with the news texts. However, more nuanced comparison of concordance lines in which the frequent items occur reveals subtle but telling differences between their use in context in adult fiction and in fiction written for children.
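The kind of frequency analysis described in this abstract (most frequent words and word sequences) can be illustrated with a minimal sketch. This is not the authors' procedure; the function name `top_ngrams` and the toy sentence are assumptions made purely for illustration, and a real study would run over tokenised BNC sub-corpora.

```python
from collections import Counter

def top_ngrams(tokens, n, k):
    """Return the k most frequent sequences of n consecutive tokens."""
    # zip over n staggered views of the token list to form n-grams
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams).most_common(k)

# Toy input (hypothetical), standing in for a tokenised sub-corpus
sample = "the cat sat on the mat and the cat slept".split()

print(top_ngrams(sample, 1, 2))  # most frequent single words
print(top_ngrams(sample, 2, 1))  # most frequent two-word sequence
```

Running the same function over each sub-corpus and comparing the resulting lists would be one simple way to set fiction against news texts.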
-
Cognitive processes as evidence of the idiom principle
Author(s): Britt Erman
pp.: 25–53 (29)
The study seeks to establish whether pause frequency and pause duration could inform us about the size of linguistic units stored in the mental lexicon. Pauses are seen as a reflection of cognitive effort in lexical retrieval. The basic assumption is that a particular concept starts activating related concepts in a conceptual network via spreading activation. Pausing is assumed to be rare when spreading activation is at work, i.e. in the recall of multiword, or prefabricated, structures. The results show that pausing was significantly more frequent in connection with lexical search in computed as compared to prefabricated structures, thus indicating that prefabricated structures are stored and retrieved as wholes. The most important implication of the study is that the results give further support to John Sinclair’s proposed ‘idiom principle’, according to which strings that would appear to be analyzable into segments nevertheless constitute single choices.
-
Part-of-speech ratios in English corpora
Author(s): Andrew Hardie
pp.: 55–81 (27)
Using part-of-speech (POS) tagged corpora, Hudson (1994) reports that approximately 37% of English tokens are nouns, where ‘noun’ is a superordinate category including nouns, pronouns and other word-classes. It is argued here that difficulties relating to the boundaries of Hudson’s ‘noun’ category demonstrate that there is no uncontroversial way to derive such a superordinate category from POS tagging. Decisions regarding the boundary of the ‘noun’ category have small but statistically significant effects on the ratio that emerges for ‘nouns’ as a whole. Tokenisation and categorisation differences between tagging schemes make it problematic to compare the ratio of ‘nouns’ across different tagsets. The precise figures for POS ratios are therefore effectively artefacts of the tagset. However, these objections to the use of POS ratios do not apply to their use as a metric of variation for comparing datasets tagged with the same tagging scheme.
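The dependence of a ‘noun’ ratio on where the category boundary is drawn can be shown with a small sketch. The tag labels, the two category definitions, and the function name `category_ratio` below are all hypothetical and chosen only for illustration; they are not Hudson's category or any tagset discussed in the article.

```python
from collections import Counter

def category_ratio(tags, members):
    """Share of tokens whose POS tag falls in the given superordinate category."""
    counts = Counter(tags)
    total = sum(counts.values())
    return sum(counts[t] for t in members) / total if total else 0.0

# Hypothetical tag sequence for ten tokens (illustrative labels only)
tags = ["DET", "NOUN", "VERB", "PRON", "ADP",
        "DET", "NOUN", "PROPN", "VERB", "NUM"]

narrow = {"NOUN", "PROPN"}        # assumption: common and proper nouns only
broad = narrow | {"PRON", "NUM"}  # assumption: also counting pronouns and numerals

print(category_ratio(tags, narrow))  # ratio under the narrow definition
print(category_ratio(tags, broad))   # ratio under the broad definition
```

The same token sequence yields different ‘noun’ ratios under the two definitions, which is the sense in which such a figure depends on the categorisation decisions behind it.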
-
Definite article usage before Last/Next Time in spoken and written American English
Author(s): Isaiah WonHo Yoo
pp.: 83–105 (23)
Unlike any other noun in English, time can combine with either Ø last/next or the last/next and maintain the same reference, e.g. (The) last time I saw her, she was still in grad school. Nontemporal nouns cannot combine with Ø last/next, while the references of temporal nouns change with the use of the before last/next, e.g. In 2001, he said he’d come back Ø next year (= in 2008) vs. In 2001, he said he’d come back the next year (= in 2002). Based on the analyses of tokens retrieved from both spoken and written corpora, this paper describes when and how often the combines with last/next time in American English. Defining temporal nouns as nouns that refer to specific periods or points of time, this paper also argues that, contrary to what other scholars have suggested (e.g. Larson 1985), time is not a temporal but quasi-temporal noun.
-
Constraints on multiple initial embedding of clauses
Author(s): Fred Karlsson
pp.: 107–118 (12)
The received view is that there are no constraints on clausal embedding complexity in sentences. This hypothesis will be challenged here on empirical grounds from the viewpoint of multiple initial embedding of clauses. The data come from the British National Corpus, Brown, LOB, and philological scholarship. The results extend to several other ‘Standard Average European’ (SAE) languages like Finnish, German, Latin, and Swedish. There is a precise quantitative constraint on the degree of initial clausal embedding, and that limit is two. In double initial embeddings, a qualitative constraint prescribes that typically the highest embedded clause is an if-clause. The lower embedded clause should be the sentential subject of the if-clause. Here is a real example of a maximally complex, prototypical, initial clausal embedding in mainstream SAE: [Main [Init–1 If [Init–2 what is tantamount to dictatorship …] continues in a union] it can …] (LOB). Multiple initial self-embeddings are prohibited.