International Journal of Corpus Linguistics - Volume 25, Issue 4, 2020
- Keyword analysis and the indexing of Aboriginal and Torres Strait Islander identity
  Author(s): Monika Bednarek
  pp. 369–399 (31)
  Abstract: This article presents a corpus-driven sociolinguistic study of Redfern Now – the first major television drama series commissioned, written, acted, directed and produced by Indigenous industry professionals in Australia. The study examines whether corpus linguistic keyword analysis can identify evidence for type indexicality (social demographics, personae) and trait indexicality (stance, personality), with particular attention paid to the potential indexing of Aboriginal and Torres Strait Islander identity. More specifically, the study’s goal is to retrieve and analyse words that are associated with varieties of English in Australia, and with Australian Aboriginal Englishes in particular. To this end, a corpus with dialogue from Redfern Now is compared to a reference corpus of US television dialogue. Results show that Redfern Now features the use of easily recognisable and familiar words (e.g. blackfella[s], deadly; kinship terms), but also shows clear variation among characters. The case study concludes by evaluating the use of keyword analysis for identifying indexicality in telecinematic discourse.
- Speech acts in corpus pragmatics
  Author(s): Martin Weisser
  pp. 400–425 (26)
  Abstract: In corpus pragmatics, most of the research into speech acts still tends to be limited to working with the original, highly abstract, speech-act taxonomies devised by ordinary language philosophers like Austin and Searle. The aim of this article is to illustrate how the use of such restricted taxonomies may lead to oversimplified or potentially misleading impressions regarding the communicative functions expressed in spoken interaction, and to demonstrate how a more elaborate taxonomy, the DART taxonomy (Weisser, 2018), may help us gain better insights into the pragmatic strategies that occur in dialogues. To this end, I will draw on a small sample of dialogues, both from a task-oriented domain and unconstrained interaction, and contrast selected speech-act categorisations on the basis of Searle’s and the DART taxonomy, demonstrating the advantages that arise from using a more fine-grained taxonomy to describe complex verbal exchanges.
- Classifying heuristic textual practices in academic discourse
  Author(s): Maria Becker, Michael Bender and Marcus Müller
  pp. 426–460 (35)
  Abstract: In this paper, we investigate how deep learning techniques can be applied to discourse pragmatics. As a test case we analyse heuristic textual practices, defined as linguistic implementations of decision routines in research processes in academic discourse. We develop a complex annotation scheme of pragmalinguistic categories on different levels of granularity and manually annotate a corpus of texts across various scientific disciplines. This is the basis for training recurrent neural networks to classify heuristic textual practices. Our experiments show that the annotation categories are robust enough to be recognised by our models, which learn similarities of the sentence surfaces represented as word embeddings. Our study aims at an iterative human-in-the-loop process in which manual-hermeneutic and algorithmic procedures mutually advance the insight process. It underlines the fact that the interaction between manual and automated methods opens up a promising field for further research, allowing interpretative analyses of complex pragmatic phenomena in large corpora.
- Author and register as sources of variation
  Author(s): Václav Cvrček, Zuzana Laubeová, David Lukeš, Petra Poukarová, Anna Řehořková and Adrian Jan Zasina
  pp. 461–488 (28)
  Abstract: This paper investigates the contribution of author/idiolect vs. register/type-of-text – as the most salient factors influencing the final shape of a text – towards explaining the variation observed in Czech texts. Since it is almost impossible to explore the effect of these factors on authentic data, we used elicited letters collected in a fully crossed experimental design (representative sample of 200 authors × four elicitation scenarios serving as a proxy to register variation). The variation encompassed by the elicited texts is analyzed through the lens of a general-purpose multi-dimensional model of Czech. Using triangulation via three established statistical methods and one devised for the purpose of this study, we find that register matters a great deal, explaining 1.5 times as much variation overall as idiolect. This should be taken into account when designing research in sociolinguistics or variation studies in general.
- Lima or cima?
  Author(s): Claudia Posch and Gerhard Rampl
  pp. 489–503 (15)
  Abstract: This paper outlines the construction of the corpus Alpenwort, a large, genre-based corpus of German texts on alpinism. We report on issues related to building the corpus from the Austrian Alpine Club Journal (1869–2010). First, a general description of our data and the project phases from digitization and annotation to publication is given. We focus on the most interesting challenges that the diverse layouts and the extensive use of Fraktur typefaces posed for optical layout recognition and optical character recognition (OCR) as well as post-correction. The corrected data was lemmatized and annotated with part-of-speech information including named entities as well as TEI-conformant metadata. The resulting 19.9-million-word corpus is designed to be queried using CQPweb and Hyperbase and can be accessed freely online. Lastly, we give a short roadmap of current and future expansions and improvements as corpus data has been and is being enhanced in follow-up projects.
- Love, R. (2020). Overcoming Challenges in Corpus Construction: The spoken British National Corpus 2014
  Author(s): Jiawei Wang
  pp. 504–510 (7)
  This article reviews Overcoming Challenges in Corpus Construction: The Spoken British National Corpus 2014.
- Lange, C., & Leuckert, S. (2019). Corpus Linguistics for World Englishes: A Guide for Research
  Author(s): Guyanne Wilson
  pp. 511–516 (6)
  This article reviews Corpus Linguistics for World Englishes: A Guide for Research.