Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication - Volume 6, Issue 2, 2000
-
Recent advances in automatic term recognition: Experiences from the NTCIR workshop on information retrieval and term recognition
Author(s): Kyo Kageura, Masaharu Yoshioka, Koichi Takeuchi, Teruo Koyama, Keita Tsuji and Fuyuki Yoshikane
pp.: 151–173 (23)
This article provides basic background information on the articles included in this special issue on Japanese term extraction, by (i) clarifying the basic background of research into automatic term recognition (ATR), (ii) explaining briefly the ‘contest’-style workshop we organised in 1999, and (iii) briefly summarising the ATR methodologies proposed in the articles, and positioning their ideas, philosophies and methodologies within ATR from a unified perspective. Through this information, we intend to consolidate the contributions of the NTCIR TMREC workshop, and, hopefully, clarify a basic framework for discussion which different researchers can use to constructively communicate with each other about automatic term extraction and beyond.
-
An application and evaluation of the C/NC-value approach for the automatic term recognition of multi-word units in Japanese
Author(s): Hideki Mima and Sophia Ananiadou
pp.: 175–194 (20)
Technical terms are important for knowledge mining, especially as vast amounts of multi-lingual documents are available over the Internet. Thus, a domain- and language-independent method for term recognition is necessary to automatically recognize terms from Internet documents. The C/NC-value method is an efficient domain-independent multi-word term recognition method which combines linguistic and statistical knowledge. Although the C/NC-value method was originally based on the recognition of nested terms in English, our aim is to evaluate the application of the method to other languages and to show its feasibility for multi-language environments. In this article, we describe the application of the C/NC-value method to Japanese texts. Several experiments analysing the performance of the method using the NACSIS Japanese AI-domain corpus demonstrate that the method can be utilized to realize a practical domain- and language-independent term recognition system.
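The abstract does not spell out the scoring formula. As a rough illustration of how nested terms can be handled statistically, the sketch below follows the C-value definition as it is commonly published: the frequency of a candidate is reduced by the average frequency of the longer candidates that contain it, then weighted by the logarithm of its length. The function name `c_value`, the tuple-of-words representation and the toy frequencies are assumptions made for this example, and the NC-value context weighting is not covered.

```python
from math import log2

def c_value(candidate, freq):
    """C-value of a multi-word candidate.

    candidate: tuple of words, e.g. ("cell", "carcinoma")
    freq: dict mapping every candidate tuple to its corpus frequency
    (single-word candidates score 0 because log2(1) == 0)
    """
    def contains(longer, shorter):
        # True if `shorter` occurs as a contiguous sub-sequence of a strictly longer tuple
        n = len(shorter)
        return len(longer) > n and any(
            longer[i:i + n] == shorter for i in range(len(longer) - n + 1)
        )

    nesting = [t for t in freq if contains(t, candidate)]
    adjusted = freq[candidate]
    if nesting:
        # subtract the mean frequency of the longer candidates that nest this one
        adjusted -= sum(freq[t] for t in nesting) / len(nesting)
    return log2(len(candidate)) * adjusted

# Toy frequencies, invented purely for illustration
toy_freq = {
    ("basal", "cell", "carcinoma"): 5,
    ("adenoid", "cell", "carcinoma"): 1,
    ("cell", "carcinoma"): 8,
}
print(c_value(("cell", "carcinoma"), toy_freq))
```

In this toy input, "cell carcinoma" is nested in two longer candidates, so its raw frequency of 8 is reduced by their mean frequency of 3 before the length weight is applied.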
-
Automatic term recognition based on statistics of compound nouns
Author(s): Hirosi Nakagawa
pp.: 195–210 (16)
The NTCIR1 TMREC group called for participation in the term recognition task, which was part of NTCIR1, held in 1999. As an activity of TMREC, they provided us with the test collection for the term recognition task. The goal of this task is to automatically recognize and extract terms from a text corpus which consists of 1,870 abstracts gathered from the NACSIS Academic Conference Database. This article describes the term extraction method we have proposed to extract terms consisting of simple and compound nouns, and the experimental evaluation of the proposed method with this NTCIR TMREC test collection. The basic idea of scoring a simple noun N in our term extraction method is to count how many nouns are conjoined with N to make compound nouns. We then extend this score to compound nouns, because most technical terms are compound nouns. Our method has a parameter to tune the degree of preference for either longer or shorter compound nouns. As for term candidates, in addition to noun sequences, we may add variations such as patterns of "A no B", which roughly means "B of A" or "A's B", and/or "A na B", where "A na" is an adjective. Experimental results of our method are promising, namely recall of 0.83, precision of 0.46 and F-value of 0.59 for exactly matched extracted terms when we take into account the top-scoring 16,000 extracted terms.
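The scoring idea in this abstract can be sketched in a few lines. The abstract gives only the idea of counting the nouns conjoined with N, so the exact formula below (a product of neighbour counts with an exponent that depends on length) is an assumption for illustration, as are the names `neighbour_counts`, `compound_score` and `length_exponent`. The last function checks that the reported F-value is the harmonic mean of the reported precision and recall.

```python
from collections import defaultdict
from math import prod

def neighbour_counts(compounds):
    """For each noun, collect the distinct nouns seen directly to its left and right."""
    left, right = defaultdict(set), defaultdict(set)
    for nouns in compounds:
        for a, b in zip(nouns, nouns[1:]):
            right[a].add(b)  # b follows a inside a compound
            left[b].add(a)   # a precedes b inside a compound
    return left, right

def compound_score(nouns, left, right, length_exponent=1.0):
    """Hypothetical compound score built from per-noun neighbour counts.

    length_exponent is an assumed tuning knob: 1.0 gives a geometric mean,
    smaller values favour longer compounds.
    """
    product = prod(
        (len(left.get(n, ())) + 1) * (len(right.get(n, ())) + 1) for n in nouns
    )
    return product ** (1.0 / (2 * len(nouns) ** length_exponent))

def f_measure(precision, recall):
    """Balanced F-value: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Toy compound nouns, invented for illustration
toy_compounds = [
    ("machine", "translation"),
    ("machine", "learning"),
    ("statistical", "machine", "translation"),
]
left, right = neighbour_counts(toy_compounds)
print(compound_score(("machine", "translation"), left, right))
print(round(f_measure(0.46, 0.83), 2))  # 0.59, matching the reported F-value
```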
-
Extracting terms by a combination of term frequency and a measure of term representativeness
Author(s): Toru Hisamitsu, Yoshiki Niwa, Shingo Nishioka, Hirofumi Sakurai, Osamu Imaichi, Makoto Iwayama and Akihiko Takano
pp.: 211–232 (22)
This article describes a method for extracting terms that combines term frequency with a novel measure of term representativeness (i.e., informativeness or domain specificity). The measure is defined as the normalized distance between the word distribution in the documents which contain the term and the word distribution in the whole corpus. The measure is particularly effective in discarding uninformative terms that frequently appear and has a well-defined threshold value for judging the representativeness of a term. We combined the new measure with term frequency and applied it to the extraction of terms from abstracts of artificial intelligence papers. This article introduces the measure and reports on its effectiveness in term extraction.
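A minimal sketch of a distance-based representativeness measure follows. The abstract names neither the distance nor the normalization, so this example assumes Kullback-Leibler divergence as the distance and a caller-supplied `baseline` as the normalizer; both choices, and the function names, are assumptions and may differ from the article's actual definitions.

```python
from collections import Counter
from math import log

def word_distribution(docs):
    """Relative word frequencies over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def kl_distance(p, q):
    """Kullback-Leibler divergence D(p || q); q must cover every word in p."""
    return sum(pw * log(pw / q[word]) for word, pw in p.items())

def representativeness(term, corpus, baseline=1.0):
    """Distance between the documents containing `term` and the whole corpus.

    baseline is an assumed normalizer (for example, the distance expected
    for a random document set of the same size); 1.0 means no normalization.
    """
    containing = [doc for doc in corpus if term in doc]
    if not containing:
        return 0.0
    return kl_distance(word_distribution(containing), word_distribution(corpus)) / baseline

# Toy corpus, invented for illustration
toy_corpus = [
    ["neural", "network", "learning"],
    ["neural", "network", "model"],
    ["the", "model", "is", "the", "result"],
    ["the", "result", "is", "good"],
]
print(representativeness("neural", toy_corpus))
print(representativeness("the", toy_corpus))
```

Because the documents containing a term are a subset of the corpus, every word in the first distribution also has non-zero probability in the second, so the divergence is always defined.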
-
Term recognition using corpora from different fields
Author(s): Kiyotaka Uchimoto, Satoshi Sekine, Masaki Murata, Hiromi Ozaku and Hitoshi Isahara
pp.: 233–256 (24)
We present a system used in the term recognition competition, one of the subtasks covered by the NTCIR TMREC group, and we evaluate its term recognition results. We regard terms as lexical items, characteristic of a field, which have the following three features: (1) they appear frequently in documents of the target field; (2) they are not common words in the target field; and (3) they appear less frequently in the corpora of other fields. Our system uses corpora from different fields and uses these features to recognize terms. We then analyze the differences between our term list and the manual candidates list produced by the NTCIR TMREC group. In this article we identify features that are important for automatic term recognition. Furthermore, through comparative experiments based on manual candidates, we establish the importance of indices in extracting a term list.
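The three features listed in this abstract can be read as three filters over word counts. The sketch below is one possible reading, not the system described in the article: the function name `candidate_terms`, the thresholds `min_freq` and `max_ratio`, and the use of an explicit common-word list are all assumptions made for this example.

```python
from collections import Counter

def candidate_terms(target_tokens, other_tokens, common_words,
                    min_freq=2, max_ratio=0.5):
    """Keep words that pass three filters inspired by the features above.

    min_freq and max_ratio are hypothetical thresholds chosen for the demo.
    """
    target = Counter(target_tokens)
    other = Counter(other_tokens)
    n_target = sum(target.values())
    n_other = sum(other.values()) or 1  # avoid division by zero for an empty corpus
    selected = []
    for word, freq in target.items():
        if freq < min_freq:
            continue  # feature (1): must be frequent in the target field
        if word in common_words:
            continue  # feature (2): must not be a common word
        rel_target = freq / n_target
        rel_other = other[word] / n_other
        if rel_other > max_ratio * rel_target:
            continue  # feature (3): must be clearly rarer in other fields
        selected.append(word)
    return sorted(selected)

# Toy token lists, invented for illustration
target_field = "parser grammar parser the the the corpus grammar".split()
other_fields = "the market grammar price grammar the bank the".split()
print(candidate_terms(target_field, other_fields, common_words={"the"}))
# prints ['parser']
```

In the toy data, "grammar" is rejected by the third filter because it is as frequent in the other-field corpus as in the target one, while "corpus" fails the frequency filter.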
-
Statistical and linguistic approaches to automatic term recognition: NTCIR experiments at Matsushita
Author(s): Yoshio Fukushige and Naohiko Noguchi
pp.: 257–286 (30)
In this article we describe our approaches to, and results for, the Term Recognition (TMREC) task in the first NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, held 30 August–1 September 1999. Our first approach aims to collect words that appear distinctively in documents of the target domain through a statistical method. Our second approach aims to collect terms that have a particular inner structure by applying several diagnostic tests using the collocational information in the corpus. Section 1 describes the outline of the term recognition task. Section 2 briefly describes the two approaches, details of which are described in Sections 3 and 4. In Section 6, we offer a short discussion based on the comparison between the candidates.
-
Japanese term extraction using dictionary hierarchy and machine translation system
Author(s): Jong-Hoon Oh, Juho Lee, Kyung-Soon Lee and Key-Sun Choi
pp.: 287–311 (25)
There have been many studies of automatic term recognition (ATR) and they have achieved good results. However, they focus on mono-lingual term extraction methods, which makes it difficult to extract terms from documents in foreign languages. This article describes a method for automatically extracting terms from documents in foreign languages using a machine translation system. In our method, we translate documents in foreign languages into Korean and extract terms from the translated Korean documents. Finally, the terms recognized in the Korean documents are translated into terms in the foreign language. By using our method, one can extract terms for languages that one does not know.
-
Using author keywords for automatic term recognition
Author(s): Masao Utiyama, Masaki Murata and Hitoshi Isahara
pp.: 313–326 (14)
This paper proposes a method which regards the keywords provided by the authors of technical papers as terms and learns the statistics which distinguish terms from non-terms. Since it uses keywords as training data, it requires no training corpora manually annotated with terms. The proposed method was used to extract terms from the NTCIR morphologically tagged corpus and achieved 0.800 recall and 0.431 precision. The effectiveness of the proposed method has thus been demonstrated.