International Journal of Corpus Linguistics - Volume 7, Issue 2, 2002
HE and THEY in indefinite anaphora in written present-day English
Author(s): Mikko Laitinen
pp.: 137–164 (28)
This article explores the usage of singular HE and plural THEY with their possessive, objective and reflexive forms in anaphoric reference to compound indefinite pronouns in written present-day English. Previous studies have indicated that the most commonly used personal pronouns in anaphoric reference to non-referential indefinite pronouns are indeed HE and THEY. The data for the study are drawn from the written part of the British National Corpus. The structure of the study is such that following the introduction, I will survey the earlier literature on the topic to illustrate that there is a gap in the previous studies on epicene pronouns. The third section defines the indefinite pronouns used in this study. In addition, the section also discusses the differences between the meaning and form of the indefinites and the semantic reference sets of each pronoun paradigm. Following the explanation of the methods, the article sets out the findings.
A corpus-based study of connectors in student writing: Research from the International Corpus of English in Hong Kong (ICE-HK)
Author(s): Kingsley Bolton, Gerald Nelson and Joseph Hung
pp.: 165–182 (18)
This paper focuses on connector usage in the writing of university students in Hong Kong and in Great Britain, and presents results based on the comparison of data from the Hong Kong component (ICE-HK) and the British component (ICE-GB) of the International Corpus of English (ICE). While previous studies of Hong Kong student writing have dealt with the ‘underuse’, ‘overuse’, and ‘misuse’ of connectors, this study confines itself to the analysis of underuse and overuse, and is especially concerned with methodological issues relating to the accurate measurement of these concepts. Specifically, it takes as its benchmark of overuse and underuse the frequency of connectors in professional academic writing, in this case the data in the ICE-GB corpus. The results show that measured in this way, both groups of students – native speakers and non-native speakers alike – overuse a wide range of connectors. The results offer no evidence of significant underuse.
Automatic retrieval of syntactic structures: The quest for the Holy Grail
Author(s): Gaëtanelle Gilquin
pp.: 183–214 (32)
The study of complex grammatical patterns tends to be neglected by corpus linguists, the main reason being that such phenomena are much more difficult to extract from a corpus than simple words or tags. I demonstrate in this article that, although the desirable parsed corpora and appropriate software are not always available, the retrieval of syntactic structures can be automated to a certain extent. A number of corpus-based grammatical analyses, as well as a pilot study of causative structures with make, illustrate the various alternative strategies that can be used to this effect.
Two quantitative methods of studying phraseology in English
Author(s): Michael Stubbs
pp.: 215–244 (30)
Word frequency lists are a standard resource for many theoretical, descriptive and applied questions. However, due to severe problems of definition, there are no equivalent lists which give the frequency of phrases. This paper proposes two independent methods of studying the frequent phraseology of English. First, using a data-base of the most frequent collocations between word-forms in a 200-million word corpus, the strength of attraction between pairs of content words is discussed. Second, using a corpus of 2.5 million words, some of the most frequent phrases, in the sense of strings of uninterrupted word-forms, are identified, and their lexical, grammatical and semantic features are discussed.
Short term diachronic shifts in part-of-speech frequencies: A comparison of the tagged LOB and F-LOB corpora
Author(s): Christian Mair, Marianne Hundt, Geoffrey N. Leech and Nicholas Smith
pp.: 245–264 (20)
The paper presents a comparison of tag frequencies in two matching one-million word reference corpora of British standard English, the 1961 LOB-corpus and its 1991 “clone” produced at Freiburg. Both corpora were tagged using a version of the CLAWS part-of-speech-tagger developed at Lancaster, and part of the material was post-edited manually in Freiburg to assess the accuracy of the automatic procedure. The comparison of tag frequencies is an essential complement to work on recent linguistic change carried out on the untagged material, because this work has been based on the – so far unverified – assumption that tag frequencies have remained constant over the thirty-year period in question. In addition, the paper discusses some common and partly contradictory claims about the prevalence of a “nominal” style in present-day written English. It is shown that while part-of-speech frequencies have not remained constant over the period investigated, the shifts are usually not big enough to invalidate the results obtained in analyses of the untagged material. With regard to style, the material shows a significant rise in the frequency of nouns, which, however, is not paralleled by a corresponding decrease in verbs.
Today's corpus linguistics: Some open questions
Author(s): František Čermák
pp.: 265–282 (18)
The paper is concerned with problems of methodology. Against this background, the situation of today's corpora is discussed and some fields are identified as being in a far from satisfactory shape. The place of corpora in linguistics is briefly looked at, suggesting that structuralist tradition is the only one to use them extensively. Problems of annotation and ways, less (statistical) or more successful (rule-based), are raised and discussed. Here, some of the most serious shortcomings, such as multi-word units or status of language units in general that computational linguists should deal with, are listed. In a more general direction, implications and status of paradigmatics and syntagmatics are discussed, too, with considerable and critical attention paid to ontologies.