Automatic subtitles increase accuracy and decrease cognitive load in simultaneous interpreting

Abstract

This study examines the effect of real-time subtitles generated by automatic speech recognition (ASR) technology on interpreting accuracy and interpreters’ cognitive load. Multiple measures were applied: interpreting accuracy, the NASA-TLX for subjective ratings of cognitive load, eye-tracking, and theta power as indicated by EEG recordings. Twenty-three professional simultaneous interpreters worked with a video recording of a speech presented in five conditions: a baseline without subtitles and four subtitled conditions of varying precision (100%, 95%, 90% and 80%). The results reveal that the presence of subtitles significantly improved interpreting accuracy, with a suggested optimal precision rate of 90% or higher. The interpreters looked more at the subtitles than at the speaker, regardless of the subtitles’ level of precision. Contrary to our predictions, the presence of subtitles decreased, rather than increased, the cognitive load (although this outcome was shown by the EEG data only and not by the self-reported data). We conclude that the cognitive cost of processing subtitles as an additional information channel is offset by the cognitive gain achieved through visual prompting. The study highlights a complex effect of subtitles on interpreting, with factors such as subtitle presence and precision modulating the interpreters’ cognitive load in such a workflow.

/content/journals/10.1075/intp.00111.li
2024-09-16
2024-10-06
