Volume 5, Issue 1 • ISSN 2799-6190 • E-ISSN 2799-8592

Abstract

This study examines the use of Artificial Intelligence Generated Content (AIGC) tools for assessing speech difficulty in interpreter training. Twenty-five students were invited to interpret three speeches consecutively from English into Chinese and then to rate their difficulty, while ChatGPT was given the transcripts and durations of the same speeches. The students’ evaluations were compared with ChatGPT’s within a standardized framework, the Speech Difficulty Index (SDI). One-sample t-tests and one-sample Wilcoxon signed-rank tests were conducted to determine whether the two sets of assessments differed significantly. For the total scores, the results indicate a consensus between students and ChatGPT on the difficulty of a moderately challenging speech, whereas divergences were observed for the other two speeches, classified as more and less difficult respectively. A further comparison across the three component dimensions shows that students’ evaluations can differ from ChatGPT’s on “Subject Matter”, while no significant difference was found for “Speed of Delivery”; for “Density and Style”, the trend mirrors that of the total scores. A follow-up interview presents students’ perspectives on evaluating speech difficulty, showing that they rely on their subjective perceptions as the standard for forming judgements. Given ChatGPT’s capacity to analyze delivery speed and to minimize subjective bias, the integration of AIGC tools in educational settings is recommended. Interpreter trainers should, however, note the divergence between students’ subjective perceptions and objective evaluations of speech difficulty and balance the two, compensating for AIGC tools’ inability to account for subjective factors.
Providing AIGC tools with a reliable framework for speech difficulty evaluation could refine material selection, ensuring better alignment with learners’ proficiency levels and thereby optimizing the educational outcomes of interpreter training. Based on the findings and limitations of this study, several promising directions for future research are proposed.
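The comparison described above — testing each student cohort’s ratings of one speech against ChatGPT’s single SDI score — can be sketched with a one-sample t-test and a one-sample Wilcoxon signed-rank test. This is a minimal illustration, not the study’s actual analysis: the scores below are invented, and the real study used 25 participants and three speeches.

```python
# Hypothetical sketch of the study's statistical comparison.
# student_scores and chatgpt_score are invented SDI totals, not data
# from the study; they only illustrate the two tests named in the text.
from scipy import stats

student_scores = [6.0, 7.5, 5.0, 7.0, 6.0, 5.5, 7.5, 6.0, 5.5, 7.0]
chatgpt_score = 6.5  # ChatGPT's single SDI total for the same speech

# One-sample t-test: does the students' mean differ from ChatGPT's score?
t_stat, t_p = stats.ttest_1samp(student_scores, popmean=chatgpt_score)

# One-sample Wilcoxon signed-rank test on the paired differences,
# a non-parametric alternative when normality is doubtful.
diffs = [s - chatgpt_score for s in student_scores]
w_stat, w_p = stats.wilcoxon(diffs)

print(f"t = {t_stat:.3f}, p = {t_p:.3f}")
print(f"W = {w_stat:.3f}, p = {w_p:.3f}")
```

Running both tests, as the study does, guards against the small-sample normality assumption of the t-test; agreement between the parametric and non-parametric results strengthens the conclusion either way.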

Available under the CC BY-NC-ND 4.0 license.
/content/journals/10.54754/incontext.v5i1.104
2025-05-31
2026-04-19