Patterns of rater behaviour in the assessment of an oral interaction test
- Source: Australian Review of Applied Linguistics, Volume 17, Issue 2, Jan 1994, pp. 77–103
Abstract
Lack of inter-rater agreement in the assessment of oral tests is well known. In this paper, multi-faceted Rasch analysis was used to determine whether any bias was evident in the way a group of raters (N=13) rated two different versions of an oral interaction test – direct (live) and semi-direct (tape-mediated) – undertaken by the same candidates (N=83). Rasch measurement allows analysis of the interaction between ‘facets’; in this case, raters, items and candidates are all facets. In this study, the interaction between rater and item was investigated in order to determine whether particular tasks in the test were scored in a consistently biased way by particular raters. The results of the analysis indicated that certain raters consistently assessed the tape version of the test more harshly, whilst others consistently rated the live version more harshly. This approach also allowed a finer-grained analysis at the level of individual items with respect to harshness and consistency across ratings. The implications for rater training and feedback are discussed.
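For readers unfamiliar with multi-faceted Rasch measurement, the rating-scale form of the many-facet Rasch model is commonly written as follows. This is a sketch of the standard formulation with conventional symbols, not notation taken from the paper itself:

```latex
% Many-facet Rasch model (rating-scale form):
% probability of candidate n receiving category k rather than
% category k-1 on item i from rater j.
\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = B_n - D_i - C_j - F_k
% where:
%   B_n = ability of candidate n
%   D_i = difficulty of item (task) i
%   C_j = severity (harshness) of rater j
%   F_k = difficulty of the step from category k-1 to category k
```

A rater-by-item bias analysis of the kind reported here examines whether the residuals for a particular rater–item combination depart systematically from this model, i.e. whether a rater is consistently harsher or more lenient on a given task than their overall severity $C_j$ would predict.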