International Journal of Corpus Linguistics - Volume 5, Issue 2, 2000
-
The Propagation of Core Lexicons Using On-line Language Resources and Savoir Faire
Author(s): Evelyne Viegas
pp.: 133–145 (13)
In this article, we discuss methodologies to extend computational semantic lexicons in a cost effective way using on-line language resources and savoir faire. First, we introduce the ecology of computational semantic lexicon acquisition, presenting two main methodologies: thesaurus-driven versus corpus-driven. Second, we describe an experiment to extend a semantics-based core lexicon with paradigmatic relations and predict the syntactic behavior of verbs based on their semantics; the automatically derived subcategorizations are first checked against corpora and then manually filtered. These lexicons have been developed within Mikrokosmos, a semantics-based machine translation system.
-
Lexical Frequencies in a 300 Million Word Corpus of Australian Newspapers. Analysis and Interpretation
Author(s): Gerhard Leitner
pp.: 147–178 (32)
Corpus linguistics, descriptive, sociolinguistics, and psycholinguistics use corpora and generalise their findings beyond the samples contained in them. That raises the problem of the representativity of the data base and of the application of methods for the presentation of findings. Although this paper originated in the context of the pluricentricity of English in the lexis of mainstream Australian English (mAusE), it was inspired by the current debates about corpus methodology (Kretzschmar et al. 1987). It is based on a large newspaper corpus that extends over a period of six years. It studies the distribution patterns of a small set of lexical items that are derived from Aboriginal languages or relate to Aboriginal concerns. While there appears to be a fairly consistent stable core, these items manifest significant differences in occurrence over the six-year period and in the media outlets and that raises the questions of what a replicate study of these items (or of others) would find and whether a corpus can claim to be representative in the first place.
-
The Process of Designing a Multidisciplinary Monolingual Sample Corpus
Author(s): N.S. Dash
pp.: 179–197 (19)
This paper discusses the approach of developing a sample of printed corpus in Bangla, one of the national languages of India and the only national language of Bangladesh. It is designed from the data collected from various published documents. The paper highlights different issues related to corpus generation, data-file preparation, language analysis, and processing as well as application potentials to different areas of pure and applied linguistics. It also includes statistical studies on the corpus along with some interpretation of the results. The difficulties that one may face during corpus generation are also pointed out.
-
Learning Lessons from Bilingual Corpora: Benefits for Machine Translation
Author(s): Oliver Streiter and Leonid L. Iomdin
pp.: 199–230 (32)
The research described in this paper is rooted in the endeavors to combine the advantages of corpus-based and rule-based MT approaches in order to improve the performance of MT systems, most importantly the quality of translation. The authors review the ongoing activities in the field and present a case study, which shows how translation knowledge can be drawn from parallel corpora and compiled into the lexicon of a rule-based MT system. These data are obtained with the help of three procedures: (1) identification of hitherto unknown one-word translations, (2) statistical rating of the known one-word translations, and (3) extraction of new translations of multiword expressions (MWEs) followed by compilation steps which create new rules for the MT engine. As a result, the lexicon is enriched with translation equivalents attested for different subject domains, which facilitates the tuning of the MT system to a specific subject domain and improves the quality and adequacy of translation.
-
Contextual Clues to Word-Meaning
Author(s): Antoinette Renouf and Laurie Bauer
pp.: 231–258 (28)