Determining semantic equivalence of terms in information retrieval
An important issue in information retrieval is determining the semantic equivalence between terms in a query and terms in a document. We propose an approach based on context distance and morphology. Context distance is a measure we use to assess the closeness of word meanings. It compares the contexts in which a word appears, using both local document information and global lexical co-occurrence information derived from the entire set of documents to be retrieved. We integrate this context distance model with morphological analysis when determining the semantic equivalence of terms, so that the two operations can enhance each other. Using the standard vector-space model, we evaluated the proposed method on a subset of the TREC-4 corpus (the AP88 and AP90 collections; 158,240 documents, 49 queries). Results show that this method improves the 11-point average precision by 8.6%.
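To make the idea of comparing word contexts concrete, the following is a minimal sketch and not the model evaluated here. It assumes a fixed-size word window, raw co-occurrence counts, and cosine similarity, none of which are specified in the abstract, and it omits the combination of local and global information. The names `context_vector`, `cosine_similarity`, `context_distance`, and the `window` parameter are hypothetical.

```python
import math
from collections import Counter


def context_vector(tokens, target, window=2):
    """Count the words found within `window` positions of each occurrence of `target`.

    Assumption: a symmetric fixed-size window with raw counts (illustrative only).
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a term with no observed context shares nothing
    return dot / (norm_a * norm_b)


def context_distance(tokens, term_a, term_b, window=2):
    """Illustrative distance: 1 minus the cosine similarity of the two context vectors."""
    vec_a = context_vector(tokens, term_a, window)
    vec_b = context_vector(tokens, term_b, window)
    return 1.0 - cosine_similarity(vec_a, vec_b)


if __name__ == "__main__":
    text = "the bank raised interest rates while the banks raised lending rates"
    tokens = text.split()
    # Morphological variants that share context words get a distance below 1.
    print(context_distance(tokens, "bank", "banks"))
```

Under these assumptions, a small distance between two morphological variants (for example a singular and a plural form) would count as evidence that they can be treated as equivalent terms, while a large distance would suggest keeping them apart.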