Volume 11, Issue 3
  • ISSN 2352-1805
  • E-ISSN: 2352-1813

Abstract

This article introduces a method for identifying and classifying translation equivalences in multilingual news texts and applies it to building a corpus for the study of news translation, a notably challenging area within Translation Studies. The dataset comprises 41 Greek-English news dispatches on the topic of migration by AMNA, the Greek national news agency. Conceptually, we build on previous research on ‘comparallel’ corpus architectures, which combine features of comparable and parallel corpora and provide the flexibility needed to account for the non-prototypical translated data that characterize multilingual news. The automated method uses state-of-the-art Natural Language Processing techniques, namely sentence and word embeddings, which make it possible to capture nuanced translation relationships by distinguishing between translated, partially translated, related, and unrelated sentence pairs. We test the method against a benchmark of manually annotated sentences from the AMNA dataset and provide examples of correctly and incorrectly classified sentence pairs. Finally, we build a fully fledged comparallel corpus from the dataset and present a case study demonstrating how the corpus can be leveraged for corpus-assisted studies of news discourse, most notably to investigate newsworthiness and ideological shifts in multilingual news.
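
As a purely illustrative sketch of how embedding similarity could separate the four sentence-pair categories named in the abstract, the snippet below compares two vectors by cosine similarity and maps the score to a label. The cut-off values, the names `THRESHOLDS` and `classify_pair`, and the toy two-dimensional vectors are all assumptions made for illustration; the abstract does not specify the authors' thresholds, embedding model, or how word embeddings enter the decision.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


# Hypothetical cut-offs, highest first. These values are NOT from the article.
THRESHOLDS = [
    (0.85, "translated"),
    (0.65, "partially translated"),
    (0.40, "related"),
]


def classify_pair(source_vec, target_vec):
    """Map the similarity of a sentence pair's embeddings to one of four labels."""
    score = cosine_similarity(source_vec, target_vec)
    for cutoff, label in THRESHOLDS:
        if score >= cutoff:
            return label
    return "unrelated"


# Toy vectors standing in for real sentence embeddings.
print(classify_pair([1.0, 0.0], [1.0, 0.0]))  # identical direction
print(classify_pair([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

In practice the vectors would come from a multilingual sentence encoder, and any thresholds would need to be tuned against manually annotated pairs such as the benchmark the abstract mentions.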

/content/journals/10.1075/ttmc.00171.fer
2025-08-19
2026-05-17