Volume 31, Issue 1
  • ISSN 0929-9971
  • E-ISSN: 1569-9994

Abstract

A crucial task in any type of terminology work is identifying and extracting terms from relevant sources, which can be done manually or via (semi-)automatic term extraction processes. Given the recent advances in automatic term extraction (ATE) research, this paper explores the impact of ATE on terminology work in institutional settings (academic institutions, administrations, European institutions and international organizations) based on qualitative data. The analysis of 15 semi-structured expert interviews conducted in 2023 shows that the newest advances in research in ATE have not had an immediate impact on terminology practices in institutional settings for the study participants. This paper aims to discuss the reasons for the slow uptake of ATE in institutional settings, such as the gap between ATE tools developed in research and ATE components integrated in off-the-shelf terminology or corpus management systems, the lack of integration into existing workflows, the lack of support for certain languages, especially for less-resourced languages, as well as reasons related to source materials.

Available under the CC BY 4.0 license.
/content/journals/10.1075/term.00085.wis
2025-05-23
2025-06-24
