Full text loading...
-
Automatic acquisition of verb subcategorization information by exploiting mininal linguistic resources
- Source: International Journal of Corpus Linguistics, Volume 9, Issue 1, Jan 2004, p. 1 - 28
- Previous Article
- Table of Contents
- Next Article
Abstract
A set of well known statistical filtering methods (binomial hypothesis testing, log-likelihood ratio, t-test, thresholds on relative frequencies) is used on Modern Greek and English corpora in order to automatically acquire verb subcategorization frames that are not limited in number and are not known beforehand. As sophisticated linguistic resources and tools are not available for most languages (including Modern Greek), pre-processing of our corpora reaches merely the stage of elementary, intrasentential, non-embedded phrase chunking. By forming, permutating and counting subsets of the verb's neighboring set of phrases, and by applying the statistical filters mentioned previously, valid syntactic frames of verbs are detected. The results achieved were comparable to and, in several cases, better than the ones of previous approaches, even approaches utilizing richer resources. Incorporating the extracted list of frames into a shallow parser, the performance of the latter increases by almost 6%, showing thereby the importance of the acquired knowledge.