Volume 19, Issue 4
  • ISSN 1018-2101
  • E-ISSN: 2406-4238


This paper presents EXMARaLDA, a system for the computer-assisted creation and analysis of spoken language corpora. The first part contains some general observations about technological and methodological requirements for doing corpus-based pragmatics. The second part explains the system’s architecture and gives an overview of its most important software components – a transcription editor, a corpus management tool and a corpus query tool. The last part presents some corpora which have been or are currently being compiled with the help of EXMARaLDA.



