Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication - Volume 10, Issue 1, 2004
-
Lexically-based terminology structuring
Author(s): Natalia Grabar and Pierre Zweigenbaum
pp.: 23–53 (31)
Terminology structuring has been the subject of much work in the context of terms extracted from corpora: given a set of terms, obtained from an existing resource or extracted from a corpus, it consists in identifying hierarchical (or other types of) relations between these terms. The present work aims at assessing the feasibility of such structuring by studying it on an existing hierarchically structured terminology. Our overall goal is to test various structuring methods proposed in the literature and to check how they fare on this task. The specific goal at the present stage of our work, which we report here, is focussed on lexical methods that match terms on the basis of their content words, taking morphological variants and synonyms into account. We describe experiments performed on the French version of the US National Library of Medicine MeSH thesaurus. We compare the lexically-induced relations with the original MeSH relations and measure recall and precision metrics, taking two different views on the task: relation recovery and term placement. This method proposes correct term placement for up to 26% of the MeSH concepts, and its precision can reach 58%. After this quantitative evaluation, we perform a qualitative, human analysis of the ‘new’ relations not present in the MeSH. This analysis shows, on the one hand, the limits of the lexical structuring method. On the other hand, it reveals some specific structuring choices and naming conventions made by the MeSH designers, and emphasizes ontological commitments that cannot be left to automatic structuring.
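The comparison of induced relations with reference relations described in this abstract can be illustrated with a minimal sketch. It assumes that relations are represented as (child, parent) pairs and that a relation counts as correct only on an exact match; the function name and the toy pairs are invented for illustration and are not taken from the article or from MeSH.

```python
def precision_recall(induced, reference):
    """Compare induced relations with reference relations.

    Both arguments are iterables of (child, parent) pairs.
    Exact-match scoring is an assumption of this sketch.
    """
    induced, reference = set(induced), set(reference)
    correct = induced & reference
    # Precision: share of induced relations that are in the reference.
    precision = len(correct) / len(induced) if induced else 0.0
    # Recall: share of reference relations that were recovered.
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Toy data, invented for illustration only.
reference = {
    ("viral hepatitis", "hepatitis"),
    ("hepatitis", "liver disease"),
    ("liver cirrhosis", "liver disease"),
}
induced = {
    ("viral hepatitis", "hepatitis"),
    ("liver cirrhosis", "liver"),
}

precision, recall = precision_recall(induced, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these toy pairs, one of the two induced relations is in the reference set, and one of the three reference relations is recovered.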
-
Mining term similarities from corpora
Author(s): Goran Nenadic, Irena Spasic and Sophia Ananiadou
pp.: 55–80 (26)
In this article, we present an approach to the automatic discovery of term similarities, which may serve as a basis for a number of term-oriented knowledge mining tasks. The method for term comparison combines internal (lexical similarity) and two types of external criteria (syntactic and contextual similarities). Lexical similarity is based on sharing lexical constituents (i.e. term heads and modifiers). Syntactic similarity relies on a set of specific lexico-syntactic co-occurrence patterns indicating the parallel usage of terms (e.g., within an enumeration or within a term coordination/conjunction structure), while contextual similarity is based on the usage of terms in similar contexts. Such contexts are automatically identified by a pattern mining approach, and a procedure is proposed to assess their domain-specific and terminological relevance. Although automatically collected, these patterns are domain dependent and identify contexts in which terms are used. Different types of similarities are combined into a hybrid similarity measure, which can be tuned for a specific domain by learning optimal weights for individual similarities. The suggested similarity measure has been tested in the domain of biomedicine, and some experiments are presented.
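One way to picture a hybrid measure of this kind is the sketch below. It rests on two assumptions that the abstract does not confirm: lexical similarity is approximated by a Dice overlap of the words in two terms, and the three similarities are combined as a normalized weighted sum. The function names, weights and scores are hypothetical.

```python
def lexical_similarity(term_a, term_b):
    """Dice overlap of the word sets of two terms.

    An assumed stand-in for constituent sharing; heads and
    modifiers are not distinguished here.
    """
    a = set(term_a.lower().split())
    b = set(term_b.lower().split())
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def hybrid_similarity(lexical, syntactic, contextual,
                      weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of three similarity scores, normalized by the
    total weight (the linear form is an assumption of this sketch)."""
    w_lex, w_syn, w_ctx = weights
    total = w_lex + w_syn + w_ctx
    return (w_lex * lexical + w_syn * syntactic + w_ctx * contextual) / total


# Hypothetical scores and weights, for illustration only.
lex = lexical_similarity("nuclear receptor", "orphan nuclear receptor")
score = hybrid_similarity(lex, 0.5, 0.2, weights=(0.5, 0.3, 0.2))
print(round(lex, 2), round(score, 2))
```

In a setting like the one the abstract describes, the weights would be learned for a given domain rather than fixed by hand as they are here.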
-
Alignment and extraction of bilingual legal terminology from context profiles
Author(s): Oi Yee Kwong, Benjamin K. Tsou and Tom B. Y. Lai
pp.: 81–99 (19)
In this study, we propose a method for aligning terms and extracting translations from a small, domain-specific corpus consisting of parallel English and Chinese court judgments from Hong Kong. With a sentence-aligned corpus, translation equivalents are suggested by analysing the frequency profiles of parallel concordances. The method overcomes the limitations of conventional statistical methods which require large corpora to be effective, and those of lexical approaches which depend on existing bilingual dictionaries. Pilot testing on a parallel corpus of about 113K Chinese words and 120K English words gives an encouraging 79% precision and 38% recall on average. The method has its own limitations such as failure to detect multiple candidates and secondary translations, but it provides a good basis for acquiring an initial translation lexicon for legal terminology from indigenous bilingual legal texts.
-
Abducing term variant translations in aligned texts
Author(s): Michael Carl, Ecaterina Rascu, Johann Haller and Philippe Langlais
pp.: 101–130 (30)
Term variation is an important issue in various applications of natural language processing (NLP) such as machine translation, information retrieval and text indexing. In this paper, we describe an ‘Abductive Terminological Database’ (ATDB) aiming to detect translations of terms and their variants in bilingual texts. We describe abduction as the process of inferring specific term translation templates from multiple resources which have been induced from a bilingual text. We show that precision and recall of the ATDB increase when using more resources and when the resources interfere in a less restricted way. We discuss a way to feed back evaluation values into the induced resources, thus allowing for weighted abduction which further enhances the precision of the tool.
-
General-purpose statistical translation engine and domain specific texts: Would it work?
Author(s): Philippe Langlais and Michael Carl
pp.: 131–153 (23)
The past decade has witnessed exciting work in the field of Statistical Machine Translation (SMT). However, accurate evaluation of its potential in real-life contexts is still an open question. In this study, we investigate the behavior of an SMT engine faced with a corpus far different from the one it has been trained on. We show that terminological databases are obvious resources that should be used to boost the performance of a statistical engine. We propose and evaluate one way of integrating terminology into an SMT engine which yields a significant reduction in word error rate.
-
Question terminology and representation for question type classification
Author(s): Noriko Tomuro
pp.: 153–168 (16)
Question terminology is a set of terms which appear in keywords, idioms and fixed expressions commonly observed in questions. This paper investigates ways to automatically extract question terminology from a corpus of questions and represent it for the purpose of classifying questions by type. Our key interest is to see whether or not semantic features can enhance the representation of the strongly lexical nature of question sentences. We compare two feature sets: one with lexical features only, and another with a mixture of lexical and semantic features. For evaluation, we measure the classification accuracy achieved by two machine learning algorithms, C5.0 and PEBLS, by using a procedure called domain cross-validation, which effectively measures the domain transferability of features.