Volume 4, Issue 2
  • ISSN 1384-6655
  • E-ISSN: 1569-9811

Abstract

This paper gives an introduction to the most important steps in the process of compiling the English-Norwegian Parallel Corpus (ENPC), which contains 50 original English text extracts with their translations into Norwegian and 50 original Norwegian text extracts with their translations into English, in all about 2.6 million words. Although the most time-consuming part of the process is preparing the text extracts for the corpus, much of the focus has also been on the development of software, notably a browser handling parallel texts and an alignment program linking the original and translated versions of the same text. The preparation of the texts themselves includes scanning, proofreading, mark-up, and alignment. Although the ENPC is completed, the ENPC project is still developing, and the most recent extensions will be mentioned in this paper, such as adding more languages, compiling multiple translations (in the same language) of the same text, part-of-speech tagging, and marking direct speech and thought in the ENPC.

/content/journals/10.1075/ijcl.4.2.01oks
1999-01-01
2023-03-30
  • Article Type: Research Article
Keyword(s): Alignment; Browser for Parallel Texts; Mark-up; Parallel Corpus; Tagging