Japanese Term Extraction
  • ISSN: 0929-9971
  • E-ISSN: 1569-9994


This article describes a method for extracting terms that combines term frequency with a novel measure of term representativeness (i.e., informativeness or domain specificity). The measure is defined as the normalized distance between the word distribution in the documents that contain the term and the word distribution in the whole corpus. It is particularly effective at discarding frequent but uninformative terms, and it has a well-defined threshold value for judging whether a term is representative. We combined the new measure with term frequency and applied the combination to the extraction of terms from abstracts of artificial intelligence papers. This article introduces the measure and reports on its effectiveness in term extraction.
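
The idea of the measure can be illustrated with a minimal sketch. The abstract names neither the distance function nor the normalization, so everything specific here is an assumption made for illustration: the distance is taken to be the Kullback-Leibler divergence, the normalization divides by the average distance obtained from random document subsets of the same size, and the function names (`word_counts`, `distance`, `representativeness`) are hypothetical. It should not be read as the article's actual definition.

```python
import math
import random
from collections import Counter


def word_counts(docs):
    """Count word occurrences over a list of tokenized documents."""
    return Counter(word for doc in docs for word in doc)


def distance(subset_counts, corpus_counts):
    """ASSUMPTION: KL divergence of the subset's word distribution from
    the corpus distribution. The abstract does not name the distance."""
    subset_total = sum(subset_counts.values())
    corpus_total = sum(corpus_counts.values())
    total = 0.0
    for word, count in subset_counts.items():
        p = count / subset_total
        # The subset is drawn from the corpus, so this is never zero.
        q = corpus_counts[word] / corpus_total
        total += p * math.log(p / q)
    return total


def representativeness(term, docs, n_samples=20, seed=0):
    """Distance for the documents containing `term`, divided by the mean
    distance of random subsets of the same size.
    ASSUMPTION: this baseline normalization is illustrative only."""
    corpus_counts = word_counts(docs)
    containing = [doc for doc in docs if term in doc]
    if not containing:
        return 0.0
    observed = distance(word_counts(containing), corpus_counts)
    rng = random.Random(seed)
    baseline = sum(
        distance(word_counts(rng.sample(docs, len(containing))), corpus_counts)
        for _ in range(n_samples)
    ) / n_samples
    # A term found in every document has zero distance and zero baseline.
    return observed / baseline if baseline > 0 else 0.0


# Toy corpus: "method" occurs everywhere, "neural" only in one topic.
ai_doc = ["neural", "network", "learning", "method"]
other_doc = ["cooking", "recipe", "kitchen", "method"]
docs = [ai_doc] * 5 + [other_doc] * 5

for term in ("neural", "method"):
    print(term, representativeness(term, docs))
```

In this toy corpus, "method" appears in every document, so the documents containing it are the whole corpus and its score is zero, while "neural" is confined to one topic and scores higher. Under the assumed normalization, a score near 1 would mean that the documents containing a term look no different from a random sample of the same size.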


  • Article Type: Research Article