Text Corpora and Multilingual Lexicography
  • ISSN 1384-6655
  • E-ISSN: 1569-9811

Abstract

This contribution surveys the procedures and formats used in building the Croatian-English parallel corpus that is being compiled at the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly, which has been published since the beginning of 1998 by HIKZ (Croatian Institute for Information and Culture). After a brief overview of existing English-Croatian parallel corpora, the article addresses the procedures involved in text conversion and text encoding, particularly alignment. Several recent proposals for alignment encoding are listed and discussed at the end of the article.

/content/journals/10.1075/ijcl.6.si.10tad
2001-12-17
2019-09-20
  • Article Type: Research Article
Keyword(s): alignment, CES, corpus encoding, corpus linguistics, Croatian language, English language, parallel corpora, XML