Volume 34, Issue 2
  • ISSN 0213-2028
  • E-ISSN: 2254-6774



Traditional corpus-based methods for studying selection preferences rely on manual inspection and extraction of lexical collocates, which is a costly, labor-intensive, and time-consuming task. Automatic methods for lexical collocate extraction are therefore necessary to handle this task and the immensity of the corpora available. With a view to leveraging the platform and in-built corpora, we propose a working prototype of a Lexical Collocate Extractor (LeCoExt), a command-line tool that mines lexical collocates from all types of verbs according to their syntactic constituents and Collocate Frequency Score (CFS). This might be the first tool that performs comprehensive corpus-based studies of the selection preferences of individual verbs or groups of verbs by exploiting the capabilities offered by . The tool might facilitate the extraction of rich lexico-semantic knowledge from diverse corpora in a few seconds and with a single click. We test its performance for ontology building and refinement, building on a previous detailed analysis of stealing verbs carried out by Fernández-Martínez & Faber (2020). We show how the proposed tool is used to extract conceptual-cognitive knowledge from the THEFT scenario and implement it into FunGramKB Core Ontology through the creation and modification of theft-related conceptual units.




  1. Asaro, C., Biasiotti, M. A., Guidotti, P., Papini, M., Sagri, M. T., & Tiscornia, D.
    (2003) A domain ontology: Italian crime ontology. In Proceedings of the ICAIL 2003 Workshop on Legal Ontologies & Web based legal information management, 1–7.
  2. Berman, R.
    (1982) On the Nature of ‘Oblique’ Objects in Bitransitive Constructions. Lingua, 56(2), 101–125.
    https://doi.org/10.1016/0024-3841(82)90026-2
  3. Boas, H.
    (2013) Frame Semantics and Translation. In A. Rojo & I. Ibarretxe-Antuñano (Eds.), Cognitive Linguistics and Translation (pp. 125–158). Berlin/New York: Mouton de Gruyter.
    https://doi.org/10.1515/9783110302943.125
  4. British National Corpus, version 3 (BNC XML Edition)
    (2007) Distributed by Oxford University Computing Services on behalf of the BNC Consortium. Available at www.natcorp.ox.ac.uk/ [last accessed 15 May 2019]
  5. Bušta, J., & Herman, O.
    (2017) JSI Newsfeed Corpus. In The 9th International Corpus Linguistics Conference, University of Birmingham, 25–28 July 2017.
  6. Dux, R.
    (2018) Frames, Verbs, and Constructions: German Constructions with Verbs of Stealing. In A. Ziem & H. Boas (Eds.), Approaching German Syntax from a Constructionist Perspective (pp. 367–405). Berlin/New York: Mouton de Gruyter.
    https://doi.org/10.1515/9783110457155-010
  7. Faber, P., & Mairal-Usón, R.
    (1999) Constructing a Lexicon of English Verbs. Berlin: Mouton de Gruyter.
    https://doi.org/10.1515/9783110800623
  8. (2018) A Conceptually-Oriented Approach to Semantic Composition in RRG. In R. D. Van Valin (Ed.), The Cambridge Handbook of Role and Reference Grammar. Cambridge: Cambridge University Press.
  9. Felices-Lago, Á.
    (2014) The emergence of axiology as a key parameter in modern linguistics. In G. Thompson & L. Alba-Juez (Eds.), Evaluation in Context (pp. 27–46). John Benjamins.
    https://doi.org/10.1075/pbns.242.02fel
  10. (2015) Foundational considerations for the development of the Globalcrimeterm subontology: A research project based on FunGramKB. Onomázein, 31(1), 127–144.
    https://doi.org/10.7764/onomazein.31.9
  11. (2016) The Process of Constructing Ontological Meaning Based on Criminal Law Verbs. Círculo de Lingüística Aplicada a la Comunicación, 65, 109–148.
    https://doi.org/10.5209/rev_CLAC.2016.v65.51983
  12. Fernández-Martínez, N. J., & Faber, P.
    (2020) Who stole what from whom? A corpus-based, cross-linguistic study of English and Spanish verbs of stealing. Languages in Contrast, 20(1), 107–140.
    https://doi.org/10.1075/lic.19002.fer
  13. Fillmore, C., & Baker, C.
    (2010) A Frames Approach to Semantic Analysis. In B. Heine & H. Narrog (Eds.), The Oxford Handbook of Linguistic Analysis (pp. 313–340). New York: Oxford University Press.
  14. Gangemi, A., Sagri, M., & Tiscornia, D.
    (2005) A Constructive Framework for Legal Ontologies. In V. R. Benjamins (Eds.), Law and the Semantic Web (pp. 97–124). Berlin: Springer.
    https://doi.org/10.1007/978-3-540-32253-5_7
  15. Goldberg, A.
    (2010) Verbs, Constructions and Semantic Frames. In M. Rappaport-Hovav, E. Doron & I. Sichel (Eds.), Syntax, Lexical Semantics and Event Structure (pp. 39–58). Oxford: Oxford University Press.
    https://doi.org/10.1093/acprof:oso/9780199544325.003.0003
  16. Jakubíček, M., Kilgarriff, A., McCarthy, D., & Rychlý, P.
    (2010) Fast Syntactic Searching in Very Large Corpora for Many Languages. PACLIC, 741–747.
  17. Jakubíček, M., Kilgarriff, A., Kovář, V., Rychlý, P., & Suchomel, V.
    (2013) The TenTen Corpus Family. Seventh International Corpus Linguistics Conference CL, 125–127.
  18. Jiménez-Briones, R., & Luzondo-Oyón, A.
    (2011) Building Ontological Meaning in a Lexico-conceptual Knowledge Base. Onomázein, 23, 11–40.
  19. Kilgarriff, A., Kovář, V., Krek, S., Srdanović, I., & Tiberius, C.
    (2010) A Quantitative Evaluation of Word Sketches. Proceedings of the 14th EURALEX International Congress, 372–379.
  20. Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., & Suchomel, V.
    (2014) The Sketch Engine: Ten Years on. Lexicography, 1, 7–36. Available at www.sketchengine.co.uk [last accessed 28 December 2018]
  21. Leary, R., Vandenberghe, W., & Zeleznikow, J.
    (2004) Towards a financial fraud ontology: a legal modelling approach. ICAIL 2003 Workshop on Legal Ontologies & Web based legal information management, 1–33.
  22. Lenci, A.
    (2000) SIMPLE: A general framework for the development of multilingual lexicon. International Journal of Lexicography, 13(4), 249–263.
    https://doi.org/10.1093/ijl/13.4.249
  23. Masolo, C.
    (2003) WonderWeb Deliverable D18: Ontology Library. Laboratory for Applied Ontology, ISTC-CNR.
  24. McCarthy, D., Kilgarriff, A., Jakubíček, M., & Reddy, S.
    (2015) Semantic Word Sketches. Corpus Linguistics (CL2015), 1–5.
  25. Miller, G., & Fellbaum, C.
    (2007) WordNet Then and Now. Language Resources and Evaluation, 41(2), 209–214. Available at https://wordnet.princeton.edu/ [last accessed 17 May 2019]
    https://doi.org/10.1007/s10579-007-9044-6
  26. Niles, I., & Pease, A.
    (2001) Towards a standard Upper Ontology. In Proceedings of the Second International Conference on Formal Ontology in Information Systems. Ogunquit. Available at www.adampease.org/professional/FOIS.pdf [last accessed 10 January 2019]
    https://doi.org/10.1145/505168.505170
  27. Pedersen, B. S., & Keson, B.
    (1999) SIMPLE–Semantic information for multifunctional plurilingual lexica: some examples of Danish concrete nouns. Proceedings of the SIGLEX-99 Workshop. Maryland. Available at clair.eecs.umich.edu/aan/paper.php?paper_id=W99-0507#pdf [last accessed 15 January 2019]
  28. Periñán-Pascual, C.
    (2012) En defensa del procesamiento del lenguaje natural fundamentado en la lingüística teórica. Onomázein, 26, 13–48.
  29. (2013) A knowledge-engineering approach to the cognitive categorization of lexical meaning. VIAL – Vigo International Journal of Applied Linguistics, 10, 85–104.
  30. Periñán-Pascual, C., & Arcas-Túnez, F.
    (2004) Meaning postulates in a lexico-conceptual knowledge base. 15th International Workshop on Databases and Expert Systems Applications, IEEE, Los Alamitos (California), 38–42.
    https://doi.org/10.1109/DEXA.2004.1333446
  31. (2005) Microconceptual-Knowledge Spreading in FunGramKB. Proceedings of the 9th IASTED International Conference on Artificial Intelligence and Soft Computing. Anaheim-Calgary-Zurich: ACTA Press, 239–244.
  32. (2010a) The architecture of FunGramKB. Proceedings of the 7th International Conference on Language Resources and Evaluation, European Language Resources Association (ELRA), 2667–2674.
  33. (2010b) Ontological commitments in FunGramKB. Procesamiento del Lenguaje Natural, 44, 27–34.
  34. Periñán-Pascual, C., & Mairal-Usón, R.
    (2009) Bringing Role and Reference Grammar to Natural Language Understanding. Procesamiento del Lenguaje Natural, 43, 265–273.
  35. (2010) La gramática de COREL: un lenguaje de representación conceptual. Onomázein, 21, 11–45.
  36. (2011) The COHERENT Methodology in FunGramKB. Onomázein, 24, 13–33.
  37. Ruiz-de-Mendoza Ibáñez, F., & Mairal-Usón, R.
    (2009) Constructing meaning: a brief overview of the Lexical Constructional Model. In M. Brdar (Ed.), Converging and diverging tendencies in Cognitive Linguistics. Amsterdam/Philadelphia: John Benjamins.
  38. Ruppenhofer, J., Boas, H., & Baker, C.
    (2017) FrameNet. In P. Fuertes-Olivera (Ed.), The Routledge Handbook of Lexicography (pp. 383–398). New York: Routledge.
    https://doi.org/10.4324/9781315104942-25
  39. Rychlý, P.
    (2008) A Lexicographer-Friendly Association Score. Proceedings of the 2nd Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN, 2, 6–9.
  40. Sartor, G., Casanovas, P., Biasiotti, M. A., & Fernández-Barrera, M.
    (Eds.) (2011) Approaches to legal ontologies: Theories, domains, methodologies. Berlin: Springer.
    https://doi.org/10.1007/978-94-007-0120-5
  41. Thorgren, S.
    (2005) Transaction Verbs: A Lexical and Semantic Analysis of Rob and Steal. Reports from the Department of Language and Culture, 3, 1–44.
  42. Valente, A.
    (2005) Types and roles of legal ontologies. In R. Benjamins, P. Casanovas, J. Breuker & A. Gangemi (Eds.), Law and the semantic web (pp. 65–76). Berlin: Springer.
    https://doi.org/10.1007/978-3-540-32253-5_5
  43. Van Valin, R.
    (2005) Exploring the Syntax-Semantics Interface. Cambridge: Cambridge University Press.
    https://doi.org/10.1017/CBO9780511610578
  44. Velardi, P., Pazienza, M., & Fasolo, M.
    (1991) How to Encode Semantic Knowledge: A Method for Meaning Representation and Computer-aided Acquisition. Computational Linguistics, 17(2), 153–170.
