Volume 26, Issue 1
  • ISSN 1384-6647
  • E-ISSN: 1569-982X

Abstract

This study introduces a groundbreaking automated methodology for measuring ear–voice span (EVS) in simultaneous interpreting (SI). Traditionally, assessing EVS – a critical temporal metric in SI – has been hampered by labour-intensive and time-consuming manual methods that are prone to inconsistency. To overcome these challenges, our research harnesses state-of-the-art natural language processing (NLP) technologies, including automatic speech recognition (ASR), sentence boundary detection (SBD) and cross-lingual alignment, to automate EVS measurement. We deployed a comprehensive array of NLP models and evaluated the automated pipelines on a 20-hour English-to-Portuguese SI corpus which featured 57 varied audio pairings. The findings are encouraging: the most effective model combination achieved a median EVS error of less than 0.1 seconds across the corpus. Moreover, the automated pipelines exhibited a high level of accuracy, strong correlation and substantial agreement with manual measurements when assessing median EVS for individual audio pairs. Despite these satisfactory results, certain challenges persist with some NLP models, indicating clear avenues for future research. This study not only introduces a groundbreaking approach to large-scale EVS measurement but also propels the automation of process analysis in Interpreting Studies.
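
As a rough illustration of the quantity being reported, the sketch below computes EVS as the lag between the onset of a source segment and the onset of its aligned target segment, then compares the median EVS from an automated pipeline with the median from manual annotation. It assumes that the ASR, SBD and alignment steps have already produced onset pairs in seconds. The function names, the sample timestamps and the definition of "error" as the absolute difference of medians are illustrative assumptions, not the authors' implementation.

```python
from statistics import median

def ear_voice_spans(aligned_pairs):
    """Return one EVS value (seconds) per aligned segment pair.

    aligned_pairs: iterable of (source_onset_s, target_onset_s) tuples,
    assumed to come from an upstream cross-lingual alignment step.
    """
    return [target - source for source, target in aligned_pairs]

def median_evs(aligned_pairs):
    """Median EVS for one source/interpretation audio pair."""
    return median(ear_voice_spans(aligned_pairs))

def median_evs_error(auto_pairs, manual_pairs):
    """Absolute difference between automated and manual median EVS
    (an assumed error definition, for illustration only)."""
    return abs(median_evs(auto_pairs) - median_evs(manual_pairs))

# Hypothetical onset times (seconds) for three aligned sentences.
auto_pairs = [(0.0, 2.5), (4.0, 7.0), (9.0, 11.0)]
manual_pairs = [(0.0, 2.4), (4.1, 7.0), (9.0, 11.2)]

print(f"automated median EVS: {median_evs(auto_pairs):.2f} s")
print(f"manual median EVS:    {median_evs(manual_pairs):.2f} s")
print(f"median EVS error:     {median_evs_error(auto_pairs, manual_pairs):.2f} s")
```

Using the median rather than the mean keeps a single misaligned segment from dominating the per-pair summary, which matters when alignment errors occasionally produce very large lags.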

/content/journals/10.1075/intp.00100.guo
2024-01-15
2024-10-15