Competing target hypotheses in the Falko corpus
Error annotation is a key feature of modern learner corpora. Error identification is always based on a reconstructed version of the learner utterance, the target hypothesis. Since any single target hypothesis can capture only part of the linguistically relevant information and necessarily ignores other aspects, multiple target hypotheses are needed. Using the German learner corpus Falko as an example, we therefore argue for a flexible multi-layer stand-off corpus architecture in which competing target hypotheses can be coded in parallel. Surface differences between the learner text and each target hypothesis can then be exploited for automatic error annotation.
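
The idea of deriving error annotation from surface differences can be illustrated with a minimal sketch. It is not the Falko implementation: the learner sentence, the layer names TH1 and TH2, the function name diff_tags, and the labels CHA, DEL and INS are all assumptions made for this illustration. Each target hypothesis is kept as a separate layer and aligned token by token with the learner text using Python's standard difflib module; every non-matching span yields one candidate annotation.

```python
from difflib import SequenceMatcher

# Hypothetical labels for this sketch: changed, deleted, inserted tokens,
# seen from the learner text towards the target hypothesis.
LABELS = {"replace": "CHA", "delete": "DEL", "insert": "INS"}


def diff_tags(learner, target):
    """Return (label, learner_tokens, target_tokens) for each differing span."""
    matcher = SequenceMatcher(a=learner, b=target, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            edits.append((LABELS[op], learner[i1:i2], target[j1:j2]))
    return edits


# Invented learner sentence and two invented, competing target hypotheses.
learner = "Ich habe gestern in Kino gegangen".split()
hypotheses = {
    "TH1": "Ich bin gestern ins Kino gegangen".split(),
    "TH2": "Ich war gestern im Kino".split(),
}

# One annotation layer per hypothesis, all anchored to the same learner tokens.
annotations = {name: diff_tags(learner, th) for name, th in hypotheses.items()}

for name, edits in annotations.items():
    print(name, edits)
```

Because each hypothesis is diffed against the same unchanged learner tokens, the resulting layers can coexist without overwriting one another, which is the point of a stand-off design. A plain token diff only finds surface differences; categorising them linguistically would still require further annotation.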