Technological and methodological challenges in creating, annotating and sharing a learner corpus of spoken German

Abstract

This article discusses questions concerning the creation, annotation and sharing of spoken language corpora. We use the Hamburg Map Task Corpus (HAMATAC), a small corpus in which advanced learners of German were recorded solving a map task, as an example to illustrate our main points. We first give an overview of the corpus creation and annotation process, including recording, metadata documentation, transcription and semi-automatic annotation of the data. We then discuss the manual annotation of disfluencies as an example case that reveals many of the typical and challenging problems for data reuse – in particular the reliability of interpretative annotations.
