Volume 35, Issue 1
  • ISSN 0924-1884
  • E-ISSN: 1569-9986



Item-based scoring has been advocated as a psychometrically robust approach to translation quality assessment, outperforming traditional neo-hermeneutic and error analysis methods. The past decade has witnessed a succession of item-based scoring methods being developed and trialed, ranging from calibration of dichotomous items to preselected item evaluation. Despite this progress, these methods seem to be undermined by several limitations, such as the inability to accommodate the multifaceted reality of translation quality assessment and inconsistent item calibration procedures. Against this background, we conducted a methodological exploration, utilizing what we call an , to measure translation quality. This new method, built on the sophisticated psychometric model of many-facet Rasch measurement, inherits the item concept from its predecessors, but addresses previous limitations. In this article, we demonstrate its operationalization and provide an initial body of empirical evidence supporting its reliability, validity, and utility, as well as discuss its potential applications.




