Volume 20, Issue 2
  • ISSN 1598-7647
  • E-ISSN: 2451-909X
Abstract

Neural machine translation (MT) tools have made significant progress, which makes them usable for a growing number of domains and language pairs. This major shift in translation technology calls for revisiting the methods used to measure translation quality, in particular the so-called automatic metrics, which play a fundamental role in guiding new developments of these systems. In this article, we survey the methods used in the development cycle of machine translation tools, from purely quantitative evaluations to recently proposed methodologies for analyzing and diagnosing the behavior of these neural "black boxes".

/content/journals/10.1075/forum.00023.yvo
2023-01-12
2024-10-11