Volume 19, Issue 1
  • ISSN 0929-9971
  • E-ISSN: 1569-9994

Abstract

We report on TExSIS, a flexible bilingual terminology extraction system that uses a sophisticated chunk-based alignment method to generate candidate terms, after which the specificity of the candidate terms is determined by combining several statistical filters. Although the architecture is largely language-independent, we present terminology extraction results for four languages and three language pairs. Gold standard data sets were created for French-Italian, French-English and French-Dutch, which allowed us to evaluate not only precision, as is common practice, but also recall. We compared the TExSIS approach, which takes a multilingual perspective from the start, with the more commonly used approach of first identifying term candidates monolingually and then aligning the source and target terms. A comparison of our system with the LUIZ approach described by Vintar (2010) reveals that TExSIS outperforms LUIZ for both monolingual and bilingual terminology extraction. Our results also clearly show that the precision of the alignment is crucial to the success of the terminology extraction. Furthermore, based on the observation that the precision scores for bilingual terminology extraction exceed those of the monolingual systems, we conclude that multilingual evidence helps to determine unithood in less related languages.

/content/journals/10.1075/term.19.1.01mac
2013-01-01
2019-10-18

  • Article Type: Research Article
Keyword(s): alignment, automatic term extraction, bilingual term extraction, chunks and parallel corpora