Lexical Frequencies in a 300 Million Word Corpus of Australian Newspapers. Analysis and Interpretation
Source: International Journal of Corpus Linguistics, Volume 5, Issue 2, Jan 2000, pp. 147-178
Abstract
Corpus linguistics, descriptive linguistics, sociolinguistics, and psycholinguistics all use corpora and generalise their findings beyond the samples those corpora contain. This raises two problems: how representative the database is, and which methods are appropriate for presenting the findings. Although this paper originated in a study of the pluricentricity of English as reflected in the lexis of mainstream Australian English (mAusE), it was inspired by the current debates about corpus methodology (Kretzschmar et al. 1987). It is based on a large newspaper corpus that extends over a period of six years, and it studies the distribution patterns of a small set of lexical items that are derived from Aboriginal languages or relate to Aboriginal concerns. While there appears to be a fairly consistent, stable core, these items show significant differences in occurrence both over the six-year period and across the media outlets. That raises two questions: what a replication study of these items (or of others) would find, and whether a corpus can claim to be representative in the first place.