Automatic lexical collocate extraction for corpus-based ontology building and refinement
A FunGramKB case study of the THEFT conceptual scenario
- Source: Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics, Volume 34, Issue 2, Dec 2021, p. 435 - 463
- 30 May 2019
- 27 Mar 2020
- 15 Dec 2021
Abstract
Traditional corpus-based studies of selection preferences rely on the manual inspection and extraction of lexical collocates, a costly, labor-intensive, and time-consuming task. Automatic methods for lexical collocate extraction are therefore needed to handle this task and the sheer size of the corpora now available. Leveraging the Sketch Engine platform and its in-built corpora, we propose a working prototype of a command-line tool, the Lexical Collocate Extractor (LeCoExt), that mines the lexical collocates of all types of verbs according to their syntactic constituents and Collocate Frequency Score (CFS). This might be the first tool that exploits the capabilities of Sketch Engine to perform comprehensive corpus-based studies of the selection preferences of individual verbs or groups of verbs. It might make it possible to extract rich lexico-semantic knowledge from diverse corpora within seconds and with minimal effort. We test its performance for ontology building and refinement, taking as a starting point the detailed analysis of stealing verbs carried out by Fernández-Martínez & Faber (2020). We show how the proposed tool is used to extract conceptual-cognitive knowledge from the THEFT scenario and to implement it in the FunGramKB Core Ontology through the creation and modification of theft-related conceptual units.