International Journal of Corpus Linguistics - Volume 7, Issue 1, 2002
-
The IJS-ELAN Slovene-English Parallel Corpus
Author(s): Tomaž Erjavec
pp.: 1–20 (20)
The paper presents an annotated parallel Slovene-English corpus developed in the scope of the EU ELAN project. The IJS-ELAN corpus was compiled to be a widely distributable dataset for language engineering and for translation and terminology studies. The corpus contains 1 million words from fifteen recent terminology-rich texts. The corpus is sentence aligned and word-tagged with context disambiguated morphosyntactic descriptions and lemmas. These descriptions model simple feature structures, the structure of which is shared between Slovene and English. The corpus is encoded according to the Guidelines for Text Encoding and Interchange and is freely available on the Web for downloading. Additionally, access to IJS-ELAN is available via a powerful Web concordancer.
-
Starting with Xhosa English towards a spoken corpus
Author(s): Vivian de Klerk
pp.: 21–42 (22)
This paper describes the underlying motivation for the proposed structure and design of a corpus of Xhosa English, which aims ultimately to form part of a larger corpus of Black South African English (BSAE). The planned corpus will be exclusively based on spoken spontaneous Xhosa English, and full justification for this decision is provided in the paper. In particular the paper argues that the current South African English component of the International Corpus of English (ICE) cannot be regarded as representative of any particular variety of South African English, because of the wide range of Englishes spoken in the country (by mother-tongue speakers, Indians, white and coloured Afrikaans speakers and the speakers of South Africa's nine indigenous languages). In addition, the article problematises theoretical concepts such as deciding what “educated” or standard English is (in a multilingual country with a very complex socio-political history), and argues that some of the text categories of ICE and other spoken corpora are inappropriate for the planned Xhosa English corpus.
-
In search of representativity in specialised corpora: Categorisation through collocation
Author(s): Geoffrey Williams
pp.: 43–64 (22)
In large reference corpora representativeness is attempted through carefully selected sampling and sheer size. The situation is different with special language corpora in that their very nature limits them in size. Their representativity is measured by reference to external selection criteria, generally following bibliographic classifications, which tend to be subjective. In order to overcome subjectivity in specialised corpora, a corpus-directed system of internal selection using lexical criteria is proposed. The aim is not to create rigid boundaries but to see clearly what is actually present in the corpus. The method adopted is demonstrated on a corpus consisting of research articles from specialised journals and conference proceedings in the field of plant biology. Restricted collocational networks are used to isolate prototypical groupings within the corpus. It is shown that audience is an important factor in strong and weak prototypical groupings in theme and domain specific corpora. Articles addressing domain specialists through a journal tend to be more central than those presented to a theme-specific discourse community through conference proceedings.
-
Understanding Direct Mail Letters as a genre
Author(s): Thomas A. Upton
pp.: 65–85 (21)
What makes non-profit, philanthropic discourse so persuasive has not been well explored to date. Using a specialized corpus of direct-mail letters from philanthropic organizations in five different fields, this study seeks to combine the tools of corpus analysis with the specificity of genre analysis in a way that has not been done before to provide a new perspective on a genre that is not well understood. The underlying goal is to look for a methodology that will provide much of the qualitative detail that is common to genre analysis, while at the same time providing the reliability that is best assured by the quantitative power of computerized corpus analysis. Using Bhatia's approach to genre analysis (1993) and his exploratory efforts in investigating fundraising discourse (1997, 1998) as a foundation, key patterns in the rhetorical structure of direct-mail letters revealed through a large-scale corpus analysis are presented.
-
The contribution of verbal semantic content towards term recognition
Author(s): Eugenia Eumeridou
pp.: 87–106 (20)
Automatic term recognition is a natural language processing technology which is gaining increasing prominence in our information-overloaded society. Apart from its use for quick and efficient updating of terminologies and thesauri, it has also been used for machine translation, information retrieval, document indexing and classification as well as content representation. Until very recently, term identification techniques rested solely on the mapping of term linguistic properties onto computational procedures. However, actual terminological practice has shown that context is also important for term identification and interpretation as terms may appear in different forms depending on the situation of use. The aim of this article is to show the importance of contextual information for automatic term recognition by exploiting the relation between verbal semantic content and term occurrence in three subcorpora drawn from the British National Corpus.