The <i>English Vocabulary Profile</i> as a benchmark for assigning levels to learner corpus data
This study explores the use of the English Vocabulary Profile (EVP) for the assignment of relevant proficiency bands to learner production samples. The vocabulary of 90 essays drawn from the International Corpus of Crosslinguistic Interlanguage (ICCI) was tagged with the corresponding Common European Framework of Reference (CEFR) levels according to the information available in the EVP database. Cluster analysis was performed in order to classify the essays into five groups, which were later rank-ordered based on their length and lexical characteristics. In addition, the same 90 essays were rated on the CEFR scale by three raters. Finally, the five clusters were correlated with their rater-assigned levels with the help of a measure of rank correlation (Goodman and Kruskal's gamma). The results demonstrate a strong association between the statistically established clusters of essays and their CEFR scores.
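For readers unfamiliar with the rank correlation measure named above, the following is a minimal sketch of Goodman and Kruskal's gamma, computed as (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs and tied pairs are ignored. The function name and the sample values are hypothetical illustrations and are not taken from the study's data or code.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Return Goodman and Kruskal's gamma for two equal-length ordinal sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = 0
    discordant = 0
    # Compare every pair of observations; pairs tied on either variable are skipped.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical example: cluster ranks (1-5) against rater-assigned levels
# coded as integers. These values are invented for illustration only.
cluster_ranks = [1, 1, 2, 3, 3, 4, 5, 5]
rater_levels = [1, 2, 2, 2, 3, 4, 4, 5]
print(goodman_kruskal_gamma(cluster_ranks, rater_levels))
```

Gamma ranges from -1 (all untied pairs discordant) to 1 (all untied pairs concordant), which suits ordinal data with many ties, such as a small number of clusters compared against a small number of proficiency levels.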