Repérage automatisé de l’hyponymie dans des corpus spécialisés en français à l’aide de Sketch Engine

Abstract

Hyponymy is an essential semantic relation in terminology, as it represents the hierarchical organization of concepts. Much has been written about hyponymy extraction. However, terminologists working with French do not currently have user-friendly and freely available tools to automatically extract hyper-hyponymic pairs from their own corpora. This paper presents the most recent version of the ESSG (EcoLexicon Semantic Sketch Grammar) methodology, a knowledge-pattern-based approach that enables Sketch Engine to extract semantic relations. This methodology is applied to the development and evaluation of the ESSG-fr, a semantic sketch grammar for hyponymy extraction in French. The evaluation results show that the ESSG-fr is a reliable domain-independent tool for terminologists wishing to extract simple hyper-hyponymic pairs and the corresponding concordances from specialized corpora.
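To make the idea of a knowledge-pattern-based approach concrete, here is a minimal Python sketch that matches two lexical patterns with regular expressions. It is not the ESSG-fr and does not use Sketch Engine: actual sketch grammars are written in CQL over a lemmatized, POS-tagged corpus, whereas this sketch works on raw lowercase text. The two patterns, the `PATTERNS` list, and the `extract_pairs` function are hypothetical illustrations chosen for this example.

```python
import re

# Illustrative knowledge patterns only (assumed for this sketch, not taken
# from the ESSG-fr). Terms are restricted to single words for simplicity.
PATTERNS = [
    # "X est un type de Y" / "X est une sorte d'Y"  ->  X is a hyponym of Y
    re.compile(
        r"(?P<hypo>[\w-]+) est (?:un|une) (?:type|sorte) d(?:e |['’])(?P<hyper>[\w-]+)"
    ),
    # "Y tels que X" / "Y telles que X"  ->  X is a hyponym of Y
    re.compile(
        r"(?P<hyper>[\w-]+) (?:tels|telles) que (?:le |la |les |l['’])?(?P<hypo>[\w-]+)"
    ),
]


def extract_pairs(text):
    """Return (hyponym, hypernym) pairs found by the patterns above.

    No lemmatization is performed, so plural forms are returned as they
    appear in the text.
    """
    lowered = text.lower()
    pairs = []
    for pattern in PATTERNS:
        for match in pattern.finditer(lowered):
            pairs.append((match.group("hypo"), match.group("hyper")))
    return pairs


if __name__ == "__main__":
    print(extract_pairs("Le granite est un type de roche."))
    print(extract_pairs("Les roches telles que le granite sont dures."))
```

Because the sketch works on surface forms, the second sentence yields the plural `roches` as the hypernym; a grammar running over a lemmatized corpus would normally return the lemma instead.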

/content/journals/10.1075/term.20044.san
2022-05-12
2022-05-26
  • Article Type: Research Article
Keywords: corpus ; word sketches ; knowledge patterns ; hyponymy ; hyponym extraction