Volume 23, Issue 2
  • ISSN: 1384-6647
  • E-ISSN: 1569-982X



The study reported in this article concerns rater-mediated assessment of English-to-Chinese consecutive interpreting, particularly informational correspondence between the originally intended message and the actually rendered message, known as “fidelity” in Interpreting Studies. Previous literature has documented two main methods of assessing fidelity: comparing actual renditions with the source text, or with an exemplar rendition carefully prepared by experts (i.e., an ideal target text). However, little is known about the potential effects of these methods on fidelity assessment. We therefore conducted this study to explore how the two methods affect rater reliability, fidelity ratings and rater perception. Our analysis of the quantitative data shows that raters tended to be less reliable, less self-consistent, less lenient and less comfortable when using the source English text (Condition A) than when using the target Chinese text (Condition B: the exemplar rendition). These findings were corroborated and explained by themes emerging from the qualitative questionnaire data. The fidelity estimates in the two conditions were also found to be strongly correlated. We discuss these findings and entertain the possibility of recruiting untrained monolinguals or bilinguals to assess fidelity of interpreting.
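The core quantitative comparison the abstract describes can be illustrated with a minimal sketch. The code below is not the authors' analysis (the study uses rater-mediated measurement models); it simply shows, on invented fidelity scores for ten hypothetical interpreters, how one might check the two reported patterns: lower (less lenient) ratings in Condition A, and a strong correlation between the two conditions' fidelity estimates.

```python
# Hypothetical illustration only: all scores below are invented.
# Condition A: raters score renditions against the English source text.
# Condition B: raters score against an expert-prepared Chinese exemplar.

from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Fidelity estimates on a 0-10 scale for ten hypothetical interpreters.
condition_a = [6.1, 7.3, 5.0, 8.2, 6.8, 4.9, 7.0, 5.5, 6.4, 7.8]
condition_b = [6.8, 7.9, 5.6, 8.8, 7.2, 5.7, 7.6, 6.0, 7.1, 8.3]

r = pearson_r(condition_a, condition_b)
mean_a = sum(condition_a) / len(condition_a)
mean_b = sum(condition_b) / len(condition_b)

# A lower mean in Condition A with a high r mirrors the reported
# pattern: stricter source-text ratings that nonetheless rank the
# interpreters similarly across conditions.
print(f"r = {r:.2f}, mean A = {mean_a:.2f}, mean B = {mean_b:.2f}")
```

In a real analysis, each interpreter's fidelity estimate would be derived from multiple raters' scores rather than a single number, but the condition-level comparison follows the same logic.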






  • Article Type: Research Article
Keyword(s): consecutive interpreting; fidelity; rater-mediated assessment; source text; target text