The ALeSKo learner corpus
Design – annotation – quantitative analyses
- Author(s): Heike Zinsmeister and Margit Breckle
- Source: Multilingual Corpora and Multilingual Corpus Analysis, pp. 71-96
- Publication Date: November 2012
The ALeSKo learner corpus is a small-scale comparable corpus consisting of two subcorpora: annotated essays by advanced Chinese learners of German and comparable essays by German native speakers. It was compiled to investigate discourse-related phenomena, such as local coherence, in the second-language acquisition of German. After describing how the texts were compiled and annotated, the article focuses on quantitative studies at the token level. We discuss problems of tokenisation and part-of-speech tagging and compare the inventory of the two subcorpora in terms of frequently used N-grams and lexical richness, among other aspects. We conclude the article by describing possible applications of the study in foreign language acquisition research and language teaching.