International Journal of Corpus Linguistics - Volume 4, Issue 2, 1999
-
A Description of the English-Norwegian Parallel Corpus: Compilation and Further Developments
Author(s): Signe Oksefjell
pp.: 197–219 (23)
This paper gives an introduction to the most important steps in the process of compiling the English-Norwegian Parallel Corpus (ENPC), which contains 50 original English text extracts with their translations into Norwegian and 50 original Norwegian text extracts with their translations into English, in all about 2.6 million words. Even if the most time-consuming part of the process is to prepare the text extracts for the corpus, much of the focus has also been on the development of software, notably a browser handling parallel texts and an alignment program linking the original and translated versions of the same text. The preparation of the texts themselves includes scanning, proofreading, mark-up, and alignment.
Although the ENPC is completed, the ENPC project is still developing, and the most recent extensions will be mentioned in this paper, such as adding more languages, compiling multiple translations (in the same language) of the same text, part-of-speech tagging, and marking direct speech and thought in the ENPC.
-
"Agile" and "Uptight" Genres: The Corpus-based Approach to Language Change in Progress
Author(s): Marianne Hundt and Christian Mair
pp.: 221–242 (22)
This paper is a follow-up study to previous investigations based on the analysis of parallel British and American corpora from the early 1960s and 1990s. It focuses on variables that are suspected to contribute to the growing "colloquialisation" of the norms of written English, that is, a narrowing of the gap between spoken and written norms. Such a shift in stylistic preferences has been observed in both socio-cultural approaches to language and corpus-based studies. Contrasting material from the press and academic prose sections of standard one-million-word corpora, we are able to show that the two genres differ in the degree to which they are open to innovations or prone to retain conservative features. What we are proposing is a cline of openness to innovation ranging from "agile" to "uptight" genres.
-
Reconnecting Real Language with Real Texts: Text Linguistics and Corpus Linguistics
Author(s): Robert de Beaugrande
pp.: 243–259 (17)
The connections between language and text (or "langue and parole," or "competence and performance," etc.) have been thoroughly obscured by a long-standing tendency to attribute an ideal order to language which is not reflected in texts and then to conclude that texts are too disorderly to be worthy of linguistic investigation. So almost a century of academic research has been devoted to constructing the order of ideal language from the top down by sheer theoretical bootstrapping. Today, corpus linguistics is enabling a dramatic return to real language whilst revealing previously undescribed modes of order in the connections between language and text.
-
Towards Automatic Annotation of Anaphoric Links in Corpora
Author(s): Ruslan Mitkov
pp.: 261–280 (20)
The paper proposes a methodology for the semi-automatic annotation of pronoun-antecedent pairs in corpora. The proposal is based on robust, knowledge-poor pronoun resolution followed by post-editing.
The paper is structured as follows. The introduction comments on the fact that automatic identification of referential links in corpora has lagged behind in comparison with similar lexical, syntactical, and even semantic tasks. The second section of the paper outlines the author's robust, knowledge-poor approach to pronoun resolution which will subsequently be put forward as the core of a larger architecture proposed for the automatic tagging of referential links. Section 3 briefly presents other related knowledge-poor approaches, while Section 4 discusses the limitations and advantages of the knowledge-poor approach outlined in Section 2. The main argument of the paper is to be found in Section 5, which presents the idea of developing a semi-automatic environment for annotating anaphoric links and outlines the components of such a program. Finally, the conclusion looks at the anticipated success rate of the approach.
-
The Role of Corpora in Investigating the Linguistic Behaviour of Professional Translators
Author(s): Mona Baker
pp.: 281–298 (18)
The Translational English Corpus held at the Centre for Translation Studies at UMIST is a computerised collection of authentic, published translations into English from a variety of source languages and by a wide range of professional translators. This resource provides the basis for investigating a range of issues related to the distinctive nature of translated text, the style of individual translators, the impact of individual source languages on the patterning of English, the impact of text type on translation strategies, and other issues of interest to both the translation scholar and the linguist. Most importantly, this concrete resource allows us to develop a framework for investigating the validity of theoretical statements about the nature of translation with reference to actual translation practice.
-
LINDA BL 1.0. A Linguistic Digital Assistant for the Analysis of Block Language
Author(s): Maria D. Lopez Maestre
pp.: 299–330 (32)
In this paper, we present and discuss a computer programme designed for the linguistic annotation and processing of corpora of Block Language (headlines, proverbs, graffiti, advertising headlines, cinema titles, etc.) in English. LINDA BL 1.0 (Linguistic Digital Assistant for the Analysis of Block Language, version 1.0) was designed at the University of Murcia (Spain) to enable the user to study linguistic variation in the sentence structure of Block Language texts from a stylistic point of view and with reference to the social-semiotic environment of the context of situation of these varieties of language.