International Journal of Corpus Linguistics

ISSN 1384-6655
E-ISSN 1569-9811

The International Journal of Corpus Linguistics (IJCL) publishes original research covering methodological, applied and theoretical work in any area of corpus linguistics. Through its focus on empirical language research, IJCL provides a forum for the presentation of new findings and innovative approaches in any area of linguistics (e.g. lexicology, grammar, discourse analysis, stylistics, sociolinguistics, morphology, contrastive linguistics), applied linguistics (e.g. language teaching, forensic linguistics), and translation studies. Based on its interest in corpus methodology, IJCL also invites contributions on the interface between corpus and computational linguistics. The journal has a major reviews section publishing book reviews as well as corpus and software reviews. The language of the journal is English, but contributions are also invited on studies of languages other than English. IJCL occasionally publishes special issues (for details please contact the editor). All contributions are peer-reviewed.

Most cited this month

  • Collostructions: Investigating the interaction of words and constructions
    • Authors: Anatol Stefanowitsch and Stefan Th. Gries
    • Source: International Journal of Corpus Linguistics, Volume 8, Issue 2, 2003, pages: 209–243
    • This paper introduces an extension of collocational analysis that takes into account grammatical structure and is specifically geared to investigating the interaction of lexemes and the grammatical constructions associated with them. The method is framed in a construction-based approach to language, i.e. it assumes that grammar consists of signs (form-meaning pairs) and is thus not fundamentally different from the lexicon. The method is applied to linguistic expressions at various levels of abstraction (words, semi-fixed phrases, argument structures, tense, aspect and mood). The method has two main applications: first, to increase the adequacy of grammatical description by providing an objective way of identifying the meaning of a grammatical construction and determining the degree to which particular slots in it prefer or are restricted to a particular set of lexemes; second, to provide data for linguistic theory-building.
  • From key words to key semantic domains
    • Author: Paul Rayson
    • Source: International Journal of Corpus Linguistics, Volume 13, Issue 4, 2008, pages: 519–549
    • This paper reports the extension of the key words method for the comparison of corpora. Using automatic tagging software that assigns part-of-speech and semantic field (domain) tags, a method is described which permits the extraction of key domains by applying the keyness calculation to tag frequency lists. The combination of the key words and key domains methods is shown to allow macroscopic analysis (the study of the characteristics of whole texts or varieties of language) to inform the microscopic level (focussing on the use of a particular linguistic feature), thereby suggesting those linguistic features which should be investigated further. The resulting ‘data-driven’ approach presented here combines elements of both the ‘corpus-based’ and ‘corpus-driven’ paradigms in corpus linguistics. A web-based tool, Wmatrix, implementing the proposed method is applied in a case study: the comparison of UK 2001 general election manifestos of the Labour and Liberal Democratic parties.
  • Extending collostructional analysis: A corpus-based perspective on ‘alternations’
    • Authors: Stefan Th. Gries and Anatol Stefanowitsch
    • Source: International Journal of Corpus Linguistics, Volume 9, Issue 1, 2004, pages: 97–129
    • This paper introduces an extension of distinctive-collocate analysis that takes into account grammatical structure and is specifically geared to investigating pairs of semantically similar grammatical constructions and the lexemes that occur in them. The method, referred to as ‘distinctive-collexeme analysis’, identifies lexemes that exhibit a strong preference for one member of the pair as opposed to the other, and thus makes it possible to identify subtle distributional differences between the members of such a pair. The method can be applied in the context of what is sometimes referred to as ‘grammatical alternation’ (e.g. the dative alternation), but it can also be applied to other choices provided by the grammar (such as the two future tense constructions in English). The method has two main applications. First, it can reveal subtle differences between seemingly synonymous constructions, many of which are difficult to identify on the basis of more traditional approaches. Second, it can be used to investigate the very notion of ‘alternation’; we show that many alternations are much more restricted than has hitherto been assumed, and thus confirm the claims of recent, non-derivational views of grammar.
  • A corpus-driven approach to formulaic language in English: Multi-word patterns in speech and writing
    • Author: Douglas Biber
    • Source: International Journal of Corpus Linguistics, Volume 14, Issue 3, 2009, pages: 275–311
    • The present study utilizes a corpus-driven approach to identify the most common multi-word patterns in conversation and academic writing, and to investigate the differing pattern types in the two registers. The paper first surveys the methodological characteristics of corpus-driven research and then contrasts the linguistic characteristics of two types of multi-word sequences: ‘multi-word lexical collocations’ (combinations of content words) versus ‘multi-word formulaic sequences’ (incorporating both function words and content words). Building on this background, the primary focus of the paper is an empirical investigation of the ‘patterns’ represented by multi-word formulaic sequences. It turns out that the multi-word patterns typical of speech are fundamentally different from those typical of academic writing: patterns in conversation tend to be fixed sequences (including both function words and content words). In contrast, most patterns in academic writing are formulaic frames consisting of invariable function words with an intervening variable slot that is filled by content words.
  • The 385+ million word Corpus of Contemporary American English (1990–2008+): Design, architecture, and linguistic insights
    • Author: Mark Davies
    • Source: International Journal of Corpus Linguistics, Volume 14, Issue 2, 2009, pages: 159–190
    • The Corpus of Contemporary American English (COCA), which was released online in early 2008, is the first large and diverse corpus of American English. In this paper, we first discuss the design of the corpus, which contains more than 385 million words from 1990–2008 (20 million words each year), balanced between spoken, fiction, popular magazines, newspapers, and academic journals. We also discuss the unique relational database architecture, which allows for a wide range of queries that are not available (or are quite difficult) with other architectures and interfaces. To conclude, we consider insights from the corpus on a number of cases of genre-based variation and recent linguistic variation, including an extended analysis of phrasal verbs in contemporary American English.
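
As an illustration of the idea in the "From key words to key semantic domains" abstract above (applying a keyness calculation to tag frequency lists), here is a minimal Python sketch. The abstract does not name the statistic, so the log-likelihood ratio used here is an assumption, chosen because it is a common keyness measure in corpus comparison. The function names, tag labels and counts are hypothetical and are not taken from the paper or from Wmatrix.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood ratio (G2) for an item seen `a` times in a study
    corpus of `c` items and `b` times in a reference corpus of `d` items.
    Assumption: this is one common keyness statistic, not necessarily
    the one used in the paper."""
    e1 = c * (a + b) / (c + d)  # expected count in the study corpus
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def key_items(study, reference):
    """Rank items (words or tags) by keyness, highest first.
    Each row is (item, G2 score, overused-in-study flag)."""
    n1, n2 = sum(study.values()), sum(reference.values())
    rows = []
    for item in set(study) | set(reference):
        a, b = study.get(item, 0), reference.get(item, 0)
        overused = a / n1 > b / n2
        rows.append((item, log_likelihood(a, b, n1, n2), overused))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical semantic-domain tag counts for two small corpora.
study_tags = Counter({"HEALTH": 30, "MONEY": 10, "TIME": 60})
reference_tags = Counter({"HEALTH": 5, "MONEY": 12, "TIME": 83})

for tag, score, overused in key_items(study_tags, reference_tags):
    direction = "overused" if overused else "underused"
    print(f"{tag}\t{score:.2f}\t{direction}")
```

Because the same function works on any frequency list, passing word counts instead of tag counts gives a key words ranking, while passing semantic-tag counts gives key domains in the sense sketched here.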
