Volume 34, Issue 3
  • ISSN 0924-1884
  • E-ISSN: 1569-9986



One of the major barriers to the systematic study of indirect translation – that is, translations of translations – is the lack of efficient methods to identify these translations. In this article, we use supervised machine learning to examine whether computers can be harnessed to identify indirect translations. Our data consist of a monolingual comparable corpus that includes (1) nontranslated Finnish texts, (2) direct translations from English, French, German, Greek, and Swedish into Finnish, and (3) indirect translations from Greek (the ultimate source language) via English, French, German, and Swedish (mediating languages) into Finnish. We use n-grams of various types and lengths as feature sets and random forests as the statistical classification technique. To maximize the transferability of the method, the feature sets were implemented in accordance with the Universal Dependencies framework. This study confirms that computers can distinguish between translated and nontranslated Finnish, as well as between Finnish translations made from different source languages. Regarding indirect translations, the ultimate source language has a greater impact on the linguistic composition of indirect Finnish translations than their respective mediating languages. Hence, the indirect translations could not be reliably identified. Therefore, our results suggest that the reliable computational identification of indirect translations and their mediating languages requires a way to control for the effect of the ultimate source language.
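The n-gram feature sets mentioned above can be illustrated with a minimal sketch. It assumes each text has already been parsed into a sequence of Universal Dependencies part-of-speech tags and turns that sequence into relative n-gram frequencies. The function name `ngram_features` and the sample tag sequence are hypothetical, chosen for illustration only; they are not taken from the article or its corpus, and the article's actual feature types and lengths may differ.

```python
from collections import Counter
from typing import Dict, List, Tuple


def ngram_features(tags: List[str], n: int) -> Dict[Tuple[str, ...], float]:
    """Return the relative frequency of each n-gram in one text's tag sequence."""
    # Slide a window of length n over the tag sequence.
    grams = [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]
    total = len(grams)
    if total == 0:
        # The text is shorter than n tokens, so it has no n-grams.
        return {}
    counts = Counter(grams)
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical UPOS tag sequence for a short text (illustrative only).
sample_tags = [
    "PRON", "VERB", "DET", "NOUN", "PUNCT",
    "PRON", "VERB", "ADJ", "NOUN", "PUNCT",
]

bigrams = ngram_features(sample_tags, 2)
for gram, freq in sorted(bigrams.items()):
    print(gram, round(freq, 3))
```

Feature vectors of this kind, one per text, could then be passed to a random forest classifier together with labels such as nontranslated, direct translation, or indirect translation. The classification step is left out here to keep the sketch self-contained.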




