Volume 36, Issue 2
  • ISSN 0213-2028
  • E-ISSN: 2254-6774



Bilingual raters play an important role in assessing spoken-language interpreting between two languages, X and Y. Raters whose dominant language (DL) is X and whose less dominant language is Y may differ in their rating processes from raters whose DL is Y and whose less dominant language is X, whether they assess X-to-Y or Y-to-X interpreting. As such, raters’ language background and its interaction with interpreting directionality may influence assessment outcomes. However, this complex interaction and its effects on assessment have not been investigated. We therefore conducted an experiment to explore how raters’ language background and interpreting directionality affect the assessment of two-way English-Chinese interpreting. Our analyses of the quantitative data indicate that, when assessing interpreting into their mother tongue or DL, raters displayed greater self-confidence and self-consistency but rated performance more harshly. These statistically significant group-level disparities led to different assessment outcomes: pass and fail rates varied depending on the rater group. These quantitative findings, coupled with the raters’ qualitative comments, may have implications for the selection and training of bilingual raters for interpreting assessment.




