Volume 23, Issue 2
  • ISSN 0929-9971
  • E-ISSN: 1569-9994

Abstract

This paper presents a methodology for the automatic extraction of specialized Arabic, English and French verbs in the field of computing. Since nominal terms predominate in terminology, we explore to what extent verbs can also be part of a terminological analysis. Our objective is therefore to assess how an existing extraction tool performs on specialized verbs in a given specialized domain. We also investigate the particularities that each language presents with regard to verbal terms from the perspective of automatic extraction. We work on three languages in order to see whether the chosen tool performs better on one language than on the others. Moreover, given that Arabic is a morphologically rich and complex language, we pay particular attention to the results the extraction tool yields for it. The extractor used in our experiment is TermoStat (Drouin 2003). So far, our results show that the verbs of computing extracted for the three languages differ in quality and in the particularities these units exhibit in this specialized domain.

/content/journals/10.1075/term.00002.gha
2018-01-19
2025-02-18

References

  1. Abed, A. M., S. Tiun, and M. Albared
    2013 “Arabic Term Extraction Using Combined Approach on Islamic Document.” Journal of Theoretical & Applied Information Technology 58 (3): 601–608.
  2. Ahmad, K., A. Davies, H. Fulford, and M. Rogers
    1994 “What is a term? The Semi-automatic Extraction of Terms from Text.” In Translation Studies: An Interdiscipline, ed. by M. Snell-Hornby, F. Pöchhacker, and K. Kaindl, 267–278. Amsterdam: John Benjamins.
    https://doi.org/10.1075/btl.2.33ahm
  3. Almaany 2017 www.almaany.com/. Accessed 30 March 2017.
  4. Attia, M., P. Pecina, A. Toral, L. Tounsi, and J. van Genabith
    2011 “A Lexical Database for Modern Standard Arabic Interoperable with a Finite State Morphological Transducer.” In Systems and Frameworks for Computational Morphology: Second International Workshop, SFCM 2011, Zurich, Switzerland, August 26, 2011, Proceedings, ed. by C. Mahlow and M. Piotrowski, 98–118. Zurich, Switzerland: Springer Berlin Heidelberg.
    https://doi.org/10.1007/978-3-642-23138-4_7
  5. 2011a “An Open-Source Finite State Morphological Transducer for Modern Standard Arabic.” In Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing, 125–133. Blois, France: Association for Computational Linguistics.
  6. Attia, M., P. Pecina, A. Toral, and J. van Genabith
    2013 “A Corpus-Based Finite-State Morphological Toolkit for Contemporary Arabic.” Journal of Logic and Computation 24 (2): 455–472.
    https://doi.org/10.1093/logcom/exs070
  7. Chung, T. M.
    2003 “A Corpus Comparison Approach for Terminology Extraction.” Terminology 9 (2): 221–246.
    https://doi.org/10.1075/term.9.2.05chu
  8. Church, K., and P. Hanks
    1990 “Word Association Norms, Mutual Information, and Lexicography.” Computational Linguistics 16 (1): 22–29.
  9. Déjean, H., and E. Gaussier
    2002 “Une nouvelle approche à l’extraction de lexiques bilingues à partir de corpus comparables.” In Corpus Linguistics: Critical Concepts in Linguistics, ed. by W. Teubert and R. Krishnamurthy, 1–22. New York: Routledge.
  10. DiCoInfo 2017 olst.ling.umontreal.ca/cgi-bin/dicoinfo/search2.cgi?ui=fr. Accessed 30 March 2017.
  11. Drouin, P.
    2002 Acquisition automatique des termes: l’utilisation des pivots lexicaux spécialisés. Doctoral thesis. Université de Montréal.
  12. 2003 “Term Extraction Using Non-technical Corpora as a Point of Leverage.” Terminology 9 (1): 99–115.
    https://doi.org/10.1075/term.9.1.06dro
  13. 2004 “Detection of Domain Specific Terminology Using Corpora Comparison.” In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC), 79–82. Lisbon, Portugal: ELRA – European Language Resources Association.
  14. Fung, P.
    1998 “A Statistical View on Bilingual Lexicon Extraction: From Parallel Corpora to Non-Parallel Corpora.” In The 3rd Conference of the Association for Machine Translation in the Americas (AMTA’98), 1–17. Langhorne, PA, USA: Springer Berlin Heidelberg.
  15. Galisson, R.
    1978 Recherches de lexicologie descriptive: la banalisation lexicale. Paris: Nathan.
  16. Ghazzawi, N.
    2016 Du terme prédicatif au cadre sémantique: méthodologie de compilation d’une ressource terminologique pour les termes arabes de l’informatique. Doctoral thesis. Université de Montréal.
  17. Guilbert, L.
    1973 “La spécificité du terme scientifique et technique.” Langue française 17: 5–17.
    https://doi.org/10.3406/lfr.1973.5617
  18. Habash, N., and F. Sadat
    2006 “Arabic Preprocessing Schemes for Statistical Machine Translation.” In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, 49–52. New York, USA: Association for Computational Linguistics.
    https://doi.org/10.3115/1614049.1614062
  19. Habash, N., O. Rambow, and R. Roth
    2009 “MADA+TOKAN: A Toolkit for Arabic Tokenization, Diacritization, Morphological Disambiguation, POS Tagging, Stemming and Lemmatization.” In Proceedings of the 2nd International Conference on Arabic Language Resources and Tools (MEDAR), 102–109. Cairo, Egypt.
  20. Habash, N.
    2010 “Introduction to Arabic Natural Language Processing.” Synthesis Lectures on Human Language Technologies 3 (1): 1–187.
    https://doi.org/10.2200/S00277ED1V01Y201008HLT010
  21. Kilgarriff, A.
    2001 “Comparing Corpora.” International Journal of Corpus Linguistics 6 (1): 97–133.
    https://doi.org/10.1075/ijcl.6.1.05kil
  22. Lafon, P.
    1980 “Sur la variabilité de la fréquence des formes dans un corpus.” Mots 1 (1): 127–165.
    https://doi.org/10.3406/mots.1980.1008
  23. Lebart, L., and A. Salem
    1994 Statistique textuelle. Paris: Dunod.
  24. Lemay, C., M.-C. L’Homme, and P. Drouin
    2005 “Two Methods for Extracting Specific Single-Word Terms from Specialized Corpora: Experimentation and Evaluation.” International Journal of Corpus Linguistics 10 (2): 227–255.
    https://doi.org/10.1075/ijcl.10.2.05lem
  25. L’Homme, M.-C.
    2004 La terminologie: principes et techniques. Montréal, Canada: Les Presses de l’Université de Montréal.
  26. 2015 “Predicative Lexical Units in Terminology.” In Recent Advances in Language Production, Cognition and the Lexicon, ed. by N. Gala, R. Rapp, and G. Bel-Enguix, 75–93. Switzerland: Springer.
  27. Lorente, M.
    2007 “Les unitats lèxiques verbals dels textos especialitzats. Redefinició d’una proposta de classificació.” In Estudis de lingüística i de lingüística aplicada en honor de M. Teresa Cabré Castellví. Volum II: De deixebles, ed. by M. Lorente, R. Estopà, J. Freixa, J. Martí, and C. Tebé, 365–380. Barcelona: Institut Universitari de Lingüística Aplicada de la Universitat Pompeu Fabra.
  28. Mel’čuk, I., A. Clas, and A. Polguère
    1995 Introduction à la lexicologie explicative et combinatoire. Louvain-la-Neuve: Duculot.
  29. Meyer, I.
    2000 “Computer Words in Our Everyday Lives: How are They Interesting for Terminography and Lexicography?” In Proceedings of the Ninth EURALEX International Congress, EURALEX 2000, ed. by U. Heid, S. Evert, E. Lehmann, and C. Rohrer, 39–58. Stuttgart, Germany: Institut für Maschinelle Sprachverarbeitung.
  30. Meyer, I., and K. Mackintosh
    2000 “When Terms Move into Our Everyday Lives: An Overview of De-terminologization.” Terminology 6 (1): 111–138.
    https://doi.org/10.1075/term.6.1.07mey
  31. Monsonego, S.
    1969 “Ch. Muller: Étude de statistique lexicale. Le vocabulaire du théâtre de P. Corneille.” Langue française 3 (1): 107–110.
  32. Muller, C.
    1967 Étude de statistique lexicale, le vocabulaire du théâtre de Pierre Corneille. Paris: Larousse.
  33. 1977 Principes et méthodes de statistique lexicale. Paris: Hachette.
  34. Nelson, M. B.
    2000 Corpus-based Study of the Lexis of Business English and Business English Teaching Materials. Unpublished Ph.D. thesis. University of Manchester, Manchester.
  35. Rapp, R.
    1999 “Automatic Identification of Word Translations from Unrelated English and German Corpora.” In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics, ed. by R. Dale and K. Church, 519–526. Stroudsburg, PA, USA: Association for Computational Linguistics.
    https://doi.org/10.3115/1034678.1034756
  36. Rayson, P., and R. Garside
    2000 “Comparing Corpora Using Frequency Profiling.” In Proceedings of the Workshop on Comparing Corpora, 1–6. Stroudsburg, PA, USA: Association for Computational Linguistics.
  37. Reppen, R.
    2001 “Review of MONOCONC PRO and WORDSMITH TOOLS.” Language Learning & Technology 5 (3): 32–36.
  38. Rey, A.
    1979 La terminologie: noms et notions. Coll. “Que sais-je ?”. Paris: Presses universitaires de France.
  39. Rondeau, G.
    1984 Introduction à la terminologie. Chicoutimi, Québec: G. Morin.
  40. Sager, J. C.
    1990 A Practical Course in Terminology Processing. Amsterdam: John Benjamins.
    https://doi.org/10.1075/z.44
  41. Scott, M.
    1997 “PC Analysis of Key Words – and Key Key Words.” System 25 (2): 233–245.
    https://doi.org/10.1016/S0346-251X(97)00011-0
  42. Teubert, W.
    2009 “La linguistique de corpus: une alternative.” Semen. Revue de sémio-linguistique des textes et discours 27: 185–211.
  43. Toutanova, K., and C. Manning
    2000 “Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger.” In Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-2000), 63–70. Hong Kong: Association for Computational Linguistics.
  44. Toutanova, K., D. Klein, C. D. Manning, and Y. Singer
    2003 “Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network.” In Proceedings of HLT-NAACL, 173–180. Edmonton, Canada: Association for Computational Linguistics.
  45. Xu, F., D. Kurz, J. Piskorski, and S. Schmeier
    2002 “A Domain Adaptive Approach to Automatic Acquisition of Domain Relevant Terms and their Relations with Bootstrapping.” In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02), ed. by M. González Rodríguez and C. Paz Suarez Araujo, 134–145. Las Palmas, Canary Islands, Spain: European Language Resources Association (ELRA).