International Journal of Corpus Linguistics - Volume 3, Issue 1, 1998
-
Collocational Frameworks in Spanish
Author(s): Christopher S. Butler
pp.: 1–32 (32)
Collocational frameworks are discontinuous combinations of grammatical items which enclose lexical words (e.g., a_of in English). In this paper, the concept of collocational framework is applied to the analysis of five corpora of Spanish which allow comparisons across a range of communication channel types. Twenty-eight frameworks, consisting of articles and/or prepositions, are investigated in terms of their quantitative importance in texts and their selectivity for the lexical words they enclose. Although measures of quantitative significance and selectivity show strong correlations across all types of text, these correlations are to some extent sensitive to the spoken/written dimension. The lexical collocates also show clear semantic groupings. Implications of the work for linguistic theory and description and for language teaching and learning are briefly discussed.
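To illustrate what a collocational framework such as a_of encloses, the following is a minimal sketch, not the method used in the paper. The function name `framework_fillers`, the toy English sentence, and the use of a type/token ratio as a rough selectivity indicator are all assumptions made for illustration.

```python
from collections import Counter

def framework_fillers(tokens, left, right):
    """Count the words enclosed by the discontinuous pair `left _ right`."""
    fillers = Counter()
    # Slide a three-word window over the text and keep the middle word
    # whenever the outer two words match the framework.
    for first, middle, last in zip(tokens, tokens[1:], tokens[2:]):
        if first == left and last == right:
            fillers[middle] += 1
    return fillers

# Toy data (invented for this example, not taken from the corpora).
text = "a lot of people bought a number of books and a lot of pens"
fillers = framework_fillers(text.split(), "a", "of")

total = sum(fillers.values())          # tokens of the framework
type_token_ratio = len(fillers) / total  # assumed selectivity indicator
print(fillers.most_common(), total, round(type_token_ratio, 2))
```

A low type/token ratio would suggest that the framework selects a small set of enclosed words; whether this matches the selectivity measure used in the paper is not stated in the abstract.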
-
An Analysis of English Punctuation: The Special Case of Comma
Author(s): Murat Bayraktar, Bilge Say and Varol Akman
pp.: 33–57 (25)
Punctuation has usually been ignored by researchers in computational linguistics over the years. Recently, it has been realized that a true understanding of written language will be impossible if punctuation marks are not taken into account. This paper contains the details of a computer-aided exercise to investigate English punctuation practice for the special case of comma (the most significant punctuation mark) in a parsed corpus. The study classifies the various "structural" uses of the comma according to the syntax-patterns in which a comma occurs. The corpus (Penn Treebank) consists of syntactically annotated sentences with no part-of-speech tag information about the individual words.
-
A Recurrent Word Combination Approach to the Study of Formulae in the Speech of Native and Non-Native Speakers of English
Author(s): Sylvie De Cock
pp.: 59–80 (22)
This article reports on a pilot study into how corpus methods can be applied to the study of one type of phraseological unit, formulae, in native speaker and learner speech. Formulae, or formulaic expressions, are multi-word units performing a pragmatic and/or discourse-structuring function and have been characterised as being typically native-like. The methodology presented here is contrastive and involves the use of computerised corpora of both native and non-native speaker speech. It consists of two steps: (1) the automatic extraction of all recurrent word combinations to produce lists of potential formulae, and (2) a carefully specified manual filtering process designed to reduce these lists to lists of actual formulaic usage. The results of this process allow for the first genuine quantitative comparison of formulae in the speech of native and non-native speakers, which in turn has significant implications for SLA research. This paper focuses on methodology and does not present a full discussion of the results. However, selected example findings are presented to support the approach adopted.
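Step (1) of the methodology, the automatic extraction of recurrent word combinations, can be sketched roughly as follows. This is an illustrative approximation only: the function name `recurrent_combinations`, the frequency threshold, and the toy transcript are assumptions, and the manual filtering of step (2) is not modelled.

```python
from collections import Counter

def recurrent_combinations(tokens, n, min_freq=2):
    """Return contiguous n-word sequences occurring at least `min_freq` times."""
    # zip over n staggered copies of the token list yields every n-gram.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(ngrams)
    return {gram: c for gram, c in counts.items() if c >= min_freq}

# Toy transcript (invented for this example).
tokens = "you know i mean it is sort of you know sort of fine i mean".split()
candidates = recurrent_combinations(tokens, n=2)

for gram, count in sorted(candidates.items()):
    print(" ".join(gram), count)
```

The output is only a list of candidate formulae; deciding which of them are genuinely formulaic is the job of the manual filtering step described in the abstract.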
-
A Corpus-based Analysis of Extended Multiple Themes in PresE
Author(s): Maria Angeles Gomes Gonsalez
pp.: 81–113 (33)
This corpus-based study reformulates Halliday's (1994: 55) notion of Multiple Theme, i.e., textual and/or interpersonal items occurring before a simple Topical Theme (or clause initial transitivity/mood element) (e.g., Well, but then, Ann, surely, wouldn't the best idea be to join the group?) (cf. Berry 1982, 1995; Lautamatti 1978; Young 1980; Vasconcellos 1992). Firstly, the label Extended Multiple Theme is here proposed as a cover-term for Topical Themes co-occurring with pre-topical and/or post-topical textual and/or interpersonal elements. Secondly, Extended Multiple Themes are suggested to: (i) allow for recursiveness within the three functional slots; (ii) tend to abide by Dik's (1989: 342) Principle of Centripetal Organisation; and (iii) substantiate the layering hypothesis posited for example in Dik's Functional Grammar or in Role and Reference Grammar (cf. Hengeveld 1989; Van Valin Jr. 1993). These claims were deduced from the application of three multivariate statistical tests, namely, the Logistic Regression Technique, Fisher's Exact Test, and the χ² Test, to the tokens of Extended Multiple Themes found in real Present-day English texts, that is to say, in the Lancaster Spoken English Corpus.
-
Further Experiments in Bilingual Text Alignment
Author(s): Harold Somers
pp.: 115–150 (36)
We describe and experimentally evaluate an alternative algorithm for aligning and extracting vocabulary from parallel texts using recency vectors and a similarity measure based on Levenshtein distance. The work is largely inspired by Fung and McKeown's DK-vec, though we use a simpler algorithm. The technique is tested on two sets of parallel corpora involving English, French, German, Dutch, Spanish, and Japanese. We attempt to evaluate the importance of parameters such as frequency of words chosen as candidates, the effect of different language pairings, and differences between the two corpora.
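The two ingredients named in the abstract can be sketched as below, under explicit assumptions. A recency vector is taken here to be the list of distances between successive occurrences of a word in a text, and the comparison is a plain Levenshtein (edit) distance over those integer sequences. The abstract does not specify how the paper derives its similarity measure from the distance, so the function names and the comparison shown are illustrative only.

```python
def recency_vector(positions):
    """Gaps between successive occurrence positions of one word (assumed definition)."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]

def levenshtein(a, b):
    """Standard edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

# Hypothetical word positions in two parallel texts (invented numbers).
source_vec = recency_vector([3, 10, 25, 31])
target_vec = recency_vector([4, 11, 26, 40])
distance = levenshtein(source_vec, target_vec)
print(source_vec, target_vec, distance)
```

Word pairs whose recency vectors lie at a small distance from each other would be treated as candidate translation pairs; exact-match comparison of gaps is a simplification chosen for this sketch.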
-
Collocational Networks: Interlocking Patterns of Lexis in a Corpus of Plant Biology Research Articles
Author(s): Geoffrey Williams
pp.: 151–171 (21)
Scientific sublanguages evolve in accordance with the needs of the Discourse Community (DC) with new words being coined and a gradual change in the meanings expressed through existing lexis. In so far as the central concepts relate to each other, similar relational patterns emerge in their surface constructs, words. Consequently, the "frame of reference" for a given lexical item is to be found in the genre-specific lexical environment of that word. This is revealed through collocation, as measured using Mutual Information statistics. It is further posited that the conceptual frameworks of scientific sublanguages can be visualised through closed set collocational networks. These networks may be demonstrated locally through digraphs, but the network is posited as a more suitable means of demonstrating the complexity of relationships between individual items. The collocational networks are seen as forming the unique frame of reference for any "word" within a given sublanguage.
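The abstract does not say which Mutual Information formula is used. As a hedged illustration, the sketch below assumes the pointwise form common in corpus linguistics, MI = log2(f(x,y) · N / (f(x) · f(y))), where f(x,y) is the co-occurrence count, f(x) and f(y) are the individual word counts, and N is the corpus size. The function name and the counts are invented for the example.

```python
import math

def mutual_information(pair_count, x_count, y_count, corpus_size):
    """Pointwise mutual information (base 2) computed from raw counts."""
    observed = pair_count / corpus_size
    expected = (x_count / corpus_size) * (y_count / corpus_size)
    return math.log2(observed / expected)

# Invented counts: the pair co-occurs 8 times in a 1024-token corpus.
score = mutual_information(pair_count=8, x_count=16, y_count=32, corpus_size=1024)
print(score)
```

Pairs scoring above some chosen threshold would be linked as collocates, and chaining such links from word to word is one way to picture how a collocational network might be built up.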
-
Abstracts
Author(s): Hilde Hasselgård, Juhani Klemola, Susan Pintzuk and Jonathan Hope
pp.: 181–187 (7)