Term extraction for automatic abstracting
In this paper we describe term extraction from full-length journal articles in the domain of crop husbandry, for the purpose of producing abstracts automatically. Initially, candidate terms are extracted that occur in one of a number of fixed lexical environments. These environments are identified by a system of contextual templates, which assigns a semantic role indicator to each candidate term. Candidate terms that can be lexically validated (that is, whose constituent words and structure conform to a simple grammar for their assigned role) receive an enhanced weight. The grammar for lexical validation was derived from a training corpus of 50 journal articles. Selected terms may be used to generate a short abstract that indicates the subject matter of the paper.

We also describe a method for compiling a list of word sequences that indicate the statistical findings of an experiment, in particular the interrelationships between terms. Such word sequences, when extracted and appended to an indicative abstract, produce an informative abstract that describes specific research findings in addition to the subject matter of the paper.
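
To make the template-and-validation step concrete, the following is a minimal illustrative sketch and not the system described in the paper. The templates, the role names (`CROP`, `PEST`), the head-word lists standing in for the role grammars, and the two weight values are all hypothetical assumptions chosen for the example. It shows only the general idea: a contextual template proposes a candidate term with a role, and a candidate that passes a simple lexical check for that role receives a higher weight.

```python
import re

# Hypothetical contextual templates: each pairs a semantic role with a fixed
# lexical environment (a regex) whose single capture group is the candidate term.
TEMPLATES = [
    ("CROP", re.compile(r"\byield of ([a-z ]+?) (?:was|were)\b")),
    ("PEST", re.compile(r"\binfested with ([a-z ]+?)[.,]")),
]

# Hypothetical stand-in for the per-role grammars: a candidate is treated as
# lexically valid when its final (head) word is in the list for its role.
ROLE_HEADS = {
    "CROP": {"wheat", "barley", "potatoes"},
    "PEST": {"aphids", "nematodes"},
}

# Assumed weights; the paper states only that validated terms are weighted higher.
BASE_WEIGHT = 1.0
VALIDATED_WEIGHT = 2.0


def extract_terms(text):
    """Return (term, role, weight) tuples, highest weight first."""
    lowered = text.lower()
    results = []
    for role, pattern in TEMPLATES:
        for match in pattern.finditer(lowered):
            term = match.group(1)
            head = term.split()[-1]
            is_valid = head in ROLE_HEADS[role]
            weight = VALIDATED_WEIGHT if is_valid else BASE_WEIGHT
            results.append((term, role, weight))
    # sorted() is stable, so equally weighted terms keep their extraction order.
    return sorted(results, key=lambda item: -item[2])


if __name__ == "__main__":
    sample = (
        "The yield of winter wheat was reduced when plots were infested "
        "with aphids. The yield of the plots was low."
    )
    for term, role, weight in extract_terms(sample):
        print(f"{role:5} {weight:.1f} {term}")
```

In this sketch "the plots" matches the `CROP` template but fails the head-word check, so it keeps the base weight, while "winter wheat" and "aphids" are validated. An indicative abstract could then be assembled from the highest-weighted terms; how the original system does this is not specified in the text above.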