First steps in assigning proficiency to texts in a learner corpus of computer-mediated communication
This chapter presents a new method for assigning proficiency levels to texts in a learner corpus of computer-mediated communication (CMC). The CMC comes from learner comments on news articles that form part of an English language course for university students in Japan. The rationale for using the CMC discourse as the basis of a learner corpus will be discussed, followed by a justification for using a text-centred approach to assigning proficiency. The use of binary decision trees to account for the complexity, accuracy and fluency evident in the texts will be described, followed by a snapshot of the results from using the method so far. The chapter concludes with the suggestion that while some of the details may need refining, in principle the method could be of use in categorizing the proficiency of texts in other learner corpora.