Studies in Corpus Linguistics
SCL focuses on the use of corpora throughout language study, the development of a quantitative approach to linguistics, the design and use of new tools for processing language texts, and the theoretical implications of a data-rich discipline.
Academic and Professional Discourse Genres in Spanish
Editor(s): Giovanni Parodi
Publication Date: May 2010
This volume offers a description and a deep examination of discourse genres across four disciplines (Psychology, Social Work, Industrial Chemistry, and Construction Engineering), in academic and professional settings. The study is based on one of the largest available corpora of disciplinary written discourse in Spanish (PUCV-2006 Corpus of Spanish containing almost 60 million words). Twelve chapters range from the theoretical guiding principles of the research in terms of genre conception, the detailed description of each corpus (academic and professional), computational analysis from multi-dimensional perspectives, and the qualitative analysis of two specialized genres (University Textbook and Disciplinary Text) in terms of their rhetorical macro-moves and moves. Theoretically speaking, a multi-dimensional perspective (social, linguistic and cognitive) is emphasized, and special attention is given to the cognitive nature of discourse genres.
The Academic Discourse of Mechanical Engineering
Author(s): Thi Ngoc Phuong Le, Minh Man Pham and Michael Barlow
Publication Date: March 2023
This volume examines rhetorical conventions employed in mechanical engineering research to understand the knowledge-making principles of the discipline, as well as their expression within the research article. In particular, the study analyses the organisational patterns of mechanical engineering research articles using Swales’s conceptualisation of moves and steps. In addition, the research identifies the phraseology associated with specific moves and steps. The study draws on a corpus of 120 mechanical engineering research articles, equally distributed across two sub-disciplines (mechanical systems and thermal-fluids engineering), three research traditions (experimental, theoretical and mixed methods), and two publication periods (2002–2006 and 2012–2016). It adopts an integrated methodology, intertwining various approaches and perspectives including corpus linguistics, move analysis, discourse analysis and interviews to address two main strands of research enquiry: (i) What are the properties of the rhetorical structures in terms of range, frequency, and length for each section of mechanical engineering research articles? (ii) What effect does sub-discipline, research tradition and publication date have on the rhetorical structure of research articles?
Adjective Complementation
Author(s): Ilka Mindt
Publication Date: May 2011
This is the first empirical study to focus on adjectives complemented by that-clauses. The in-depth analysis of more than 50,000 cases taken from the British National Corpus gives comprehensive insights into hitherto neglected relations of lexis and grammar. The result of this corpus-driven study is a novel classification of adjectives based on co-occurrence patterns and corroborated with the help of statistical means. The inductive analysis of corpus data offers new perspectives on and innovative descriptions of well-known phenomena of English grammar, such as extraposition or the resultative construction so…that. It is based on a new methodological approach, which looks at mutual relations of both lexis and grammar in unprecedented ways.
Advances in Corpus-based Contrastive Linguistics
Editor(s): Karin Aijmer and Bengt Altenberg
Publication Date: March 2013
Contrastive studies have experienced a dramatic revival in the last decades. By combining the methodological advantages of computer corpus linguistics and the possibility of contrasting texts in two or more languages, the structure and use of languages can be explored with greater accuracy, detail and empirical strength than before. The approach has also proved to have fruitful practical applications in a number of areas such as language teaching, lexicography, translation studies and computer-aided translation. This volume contains twelve studies comparing linguistic phenomena in English and seven other languages. The topics range from comparisons of specific lexical categories and word combinations to syntactic constructions and discourse phenomena such as cohesion and thematic structure. The studies highlight similarities and differences in the use, semantics and functions of the compared items, as well as the emergence of new meanings and language change. The emphasis varies from purely linguistic studies to those focusing on practical applications.
Advances in Corpus-based Research on Academic Writing
Editor(s): Ute Römer, Viviana Cortes and Eric Friginal
Publication Date: February 2020
This volume showcases some of the latest research on academic writing by leading and up-and-coming corpus linguists. The studies included in the volume are based on a wide range of corpora spanning first and second language academic writing at different levels of writing expertise, containing texts from a variety of academic disciplines (and sub-disciplines) and of different academic registers. Particularly novel aspects of the collection are the inclusion of research that combines rhetorical moves with multi-dimensional analysis, studies that cover both fixed and variable phraseological items (lexical bundles, phrase-frames, constructions), and work that is based on corpora of English as an academic lingua franca. Going beyond merely summarizing their findings, the authors also discuss what their research means for academic writing practice and pedagogical settings. The volume will be of interest to researchers, students, and teachers who would like to expand their knowledge of how academic writing functions and what it looks like in a variety of contexts.
Advances in Sign Language Corpus Linguistics
Editor(s): Ella Wehrmeyer
Publication Date: April 2023
This collected volume showcases cutting-edge research in the rapidly developing area of sign language corpus linguistics in various sign language contexts across the globe. Each chapter provides a detailed account of particular national corpora and methodological considerations in their construction. Part 1 focuses on corpus-based linguistic findings, covering aspects of morphology, syntax, multilingualism, and regional and diachronic variation. Part 2 explores innovative solutions to challenges in building and annotating sign language corpora, touching on the construction of comparable sign language corpora, collaboration challenges at the national level, phonological arrangement of digital lexicons, and (semi-)automatic annotation. This unique volume documenting the growth in breadth and depth within the discipline of sign language corpus linguistics is a key resource for researchers, teachers, and postgraduate students in the field of sign language linguistics, and will also provide valuable insights for other researchers interested in corpus linguistics, Construction Grammar, and gesture studies.
Applications of Pattern-driven Methods in Corpus Linguistics
Editor(s): Joanna Kopaczyk and Jukka Tyrkkö
Publication Date: March 2018
The use of corpora has conventionally been envisioned as being either corpus-based or corpus-driven. While the formal definition of the latter term has been widely accepted since it was established by Tognini-Bonelli (2001), it is often applied to studies that do not, in fact, fulfil the fundamental requirement of a theory-neutral starting point. This volume proposes the term pattern-driven as a more precise alternative. The chapters illustrate a variety of methods that fall under this broad methodology, such as the extraction of lexical bundles, POS-grams and semantic frames, and demonstrate how these approaches can uncover new understandings of both synchronic and diachronic linguistic phenomena.
Applying Corpora in Teaching and Learning Romance Languages
Editor(s): Henry Tyne and Stefania Spina
Publication Date: November 2025
Applying Corpora in Teaching and Learning Romance Languages is the first major volume dedicated to the use of corpora in teaching and learning Romance languages. Covering four Mediterranean Romance languages (French, Italian, Spanish, and Catalan), the volume provides a thematically structured exploration of applying corpora, with sections on spoken language, writing and translation, data-driven learning, acquisition, and classroom practice. The chapters making up the volume engage critically with both historical issues and contemporary methodologies, serving to illustrate how corpora can enhance language teaching and student engagement. With its broad scope and range of insightful research findings, this volume lays the foundations for further research in applying corpus linguistics to Romance languages. A must-read for researchers and teachers wishing to engage with corpus use in the teaching and learning of Romance languages.
Automatic Treatment and Analysis of Learner Corpus Data
Editor(s): Ana Díaz-Negrillo, Nicolas Ballier and Paul Thompson
Publication Date: December 2013
This book is a critical appraisal of recent developments in corpus linguistics for the analysis of written and spoken learner data. The twelve papers cover an introductory critical appraisal of learner corpus data compilation and development (section 1); issues in data compilation, annotation and exchangeability (section 2); automatic approaches to data identification and analysis (section 3); and analysis of learner corpus data in the light of recent models of data analysis and interpretation, especially recent automatic approaches for the identification of learner language features (section 4). This collection is aimed at students and researchers of corpus linguistics, second language acquisition studies and quantitative linguistics. It will significantly advance learner corpus research in terms of methodological innovation and will fill in an important gap in the development of multidisciplinary approaches (for learner corpus studies).
Beyond Concordance Lines
Editor(s): Pascual Pérez-Paredes and Geraldine Mark
Publication Date: December 2021
In over 30 years of data-driven learning (DDL) research, there has been a growing sophistication in the ways we collect, analyse, and put corpus data to use. This volume takes a three-fold perspective on DDL. It first looks at DDL and its role in informing language learning theory and how it might shed light on the language development process; secondly it addresses how DDL can help us characterise learner language and inform teaching accordingly, and thirdly it showcases practical applications for the use of DDL in classrooms. The contributors to this volume examine a variety of instructional settings and languages across the world. They reflect on theoretical, methodological and classroom implications using both novel and established language learning theories, natural language processing (NLP), longitudinal research designs, and a variety of language learning targets. The present volume is an invitation from some of the leading researchers in DDL to reflect on the research avenues that will define the field in the coming years.
Biomedical English
Editor(s): Isabel Verdaguer, Natalia Judith Laso and Danica Salazar
Publication Date: June 2013
The corpus-based studies in this volume explore biomedical research writing in English from a variety of perspectives. The articles in this collection delve into the lexicographic issues involved in building an electronic database of collocations and lexical bundles, offer insight on the teaching and learning of prototypical multiword units of meaning in biomedical discourse, and view written scientific English through the lens of such diverse fields as phraseology, metaphor, gender and discourse analysis. The research presented in this book forms the theoretical and methodological foundation of SciE-Lex, a lexical database of collocations and prefabricated expressions designed to help scientists write scientific papers in English accurately. The concluding chapter on FrameNet addresses frame semantics, whose application to the cross-linguistic study of scientific language will open new and promising avenues of research in the study of specialized languages.
Broadening the Spectrum of Corpus Linguistics
Editor(s): Susanne Flach and Martin Hilpert
Publication Date: November 2022
This volume presents a snapshot of the current state of the art of research in English corpus linguistics. It contains selected papers from the 40th ICAME conference in 2019 and features contributions from experts in synchronic, diachronic, and contrastive linguistics, as well as in sociolinguistics, phonetics, discourse analysis, and learner language. The volume showcases the particular strengths of research in the ICAME tradition. The papers in this volume offer new insights from the reanalysis of new data types, methodological refinements and advancements of quantitative analysis, and from taking new perspectives on ongoing debates in their respective fields.
Building and Using the Siarad Corpus
Author(s): Margaret Deuchar, Peredur Webb-Davies and Kevin Donnelly
Publication Date: May 2018
This book is a research monograph divided into two parts. The first part describes the methods used to build the first sizeable corpus of informal conversational data collected from bilingual speakers of Welsh and English: Siarad. The second part describes the linguistic analysis of data from this corpus (available at bangortalk.org.uk). The information in Part One will be useful as a ‘how to’ manual on building a bilingual spoken corpus, including methods of data collection, transcription, glossing and analysis. The findings reported in Part Two throw new light on the debate regarding code-switching vs. borrowing, the application of the Matrix Language Framework (MLF) to the grammar of Welsh-English code-switching, the extralinguistic factors influencing variation in quantity of code-switching, and the extent to which the grammar of Welsh is changing in contact with English. Additional findings by other researchers using the corpus are also reported, and possible future directions are discussed.
C-ORAL-ROM
Editor(s): Emanuela Cresti and Massimo Moneglia
Publication Date: May 2005
The C-ORAL-ROM book and DVD provide a unique set of comparable corpora of spontaneous speech for the main Romance languages: French, Italian, Portuguese and Spanish. The corpora are accompanied by comparative linguistic studies, models and standard linguistic measures of spoken language variability. Each corpus is built to the same design using identical sampling techniques, and each corpus is presented in multimedia format, allowing simultaneous access to aligned acoustic and textual information. Texts are headed with information about provenance, participants, etc. and the transcriptions show changes of speaker. Speech acts are tagged according to the evidence of prosodic criteria. Each corpus totals 300,000 words and presents formal and informal speech in a variety of contexts of use, dialogue structure and text genres, semantic domains and speech act typologies. The corpora have great statistical relevance for spoken language structures and can address key issues in human language technology such as speech recognition in unrestricted discourse, the suitability of speech synthesis in natural prosody, and multilingual applications of the spoken language interface. The work provides new data and innovative theoretical perspectives that are relevant for corpus linguistics, Romance linguistics, syntactic theory, speech and prosody research, and second language acquisition.
The original C-ORAL-ROM DVD was made to run under Windows XP when Windows 7 and 8 were not yet in existence. A new version of WINPITCH-C-ORAL-ROM makes it possible to run the C-ORAL-ROM DVD under Windows 7 and 8. It can be downloaded from www.winpitch.com/
Challenges in Corpus Linguistics
Editor(s): Mark Kaunisto and Marco Schilk
Publication Date: September 2024
This book contributes to the discussion of challenges faced in different areas of corpus linguistics, namely the compilation, annotation, and analysis of linguistic corpora. In a field of growing corpus sizes and expanding possibilities of gathering data, some old issues persist, while at the same time new problems have emerged. As the compilation and study of language corpora become increasingly sophisticated and complex, continuous attention to ways of dealing with the data in question and to challenges in text selection and interpretation is needed. The contributions to this volume address problems relating to a variety of areas in corpus linguistic study, including corpus annotation, data variability, learner language, social media texts, and database utilization. The authors provide critical overviews and research-based analyses, discuss the nature of some of the common pitfalls, and offer solutions to existing problems.
Collocations in a Learner Corpus
Author(s): Nadja Nesselhauf
Publication Date: January 2005
Collocations are both pervasive in language and difficult for language learners, even at an advanced level. In this book, these difficulties are for the first time comprehensively investigated. On the basis of a learner corpus, idiosyncratic collocation use by learners is uncovered, the building material of learner collocations examined, and the factors that contribute to the difficulty of certain groups of collocations identified. An extensive discussion of the implications of the results for the foreign language classroom is also presented, and the contentious issue of the relation of corpus linguistic research and language teaching is thus extended to learner corpus analysis.
Colouring Meaning
Author(s): Gill Philip
Publication Date: February 2011
Primarily focused on idioms and other figurative phraseology, Colouring Meaning describes how the meanings of established phrases are enhanced, refocused and modified in everyday language use. Unlike many studies of creativity in language, this book-length survey addresses the matter at several levels, from the purely linguistic level of collocation, through its abstractions in colligation and semantic preference, to semantic prosody and connotation. This journey through both linguistic and cognitive levels involves the examination of habitual language and its exploitations, both mundane and colourful, explaining the phenomena observed in terms of current psycholinguistic research as well as corpus linguistics theory and analysis. The relationships between meaning in text and meaning in the mind are discussed at length and extensively illustrated with worked case studies to offer the reader a comprehensive overview of metaphorical and other secondary meanings as they emerge in real-world communicative situations.
Complexity, Accuracy and Fluency in Learner Corpus Research
Editor(s): Agnieszka Leńko-Szymańska and Sandra Götz
Publication Date: December 2022
This volume illustrates the high potential of learner corpus investigations for research into the CAF triad by presenting eleven original learner corpus-based studies which are set within solid theoretical frameworks, examine learner corpora with state-of-the-art analytical techniques and yield highly interesting findings. The volume’s major strength lies in the range of issues it undertakes and in its interdisciplinary thematic novelty. The chapters collectively address all three dimensions of L2 performance related to different linguistic subsystems (i.e. lexical, phraseological and grammatical complexity and accuracy, along with fluency) as well as the interactions among these constructs. The studies are based on data drawn from carefully compiled learner corpora which are analysed with the help of diverse corpus-based methods. The theoretical discussions and the empirical results shall contribute to the advancement of the fields of SLA and writing and speech research and shall inspire further investigations in the area of the CAF triad.
Conjunctive Markers of Contrast in English and French
Author(s): Maïté Dupont
Publication Date: June 2021
Situated at the interface between corpus linguistics and Systemic Functional Linguistics, this volume focuses on conjunctive markers expressing contrast in English and French. The frequency and placement patterns of the markers are analysed using large corpora of texts from two written registers: newspaper editorials and research articles. The corpus study revisits the long-standing but largely unsubstantiated claim that French requires more explicit markers of cohesive conjunction than English and shows that the opposite is in fact the case. Novel insights into the placement preferences of English and French conjunctive markers are provided by a new approach to theme and rheme that attaches more importance to the rheme than previous studies. The study demonstrates the significant benefits of a combined corpus and Systemic Functional Linguistics approach to the cross-linguistic analysis of cohesion.
Corpora and Discourse
Editor(s): Annelie Ädel and Randi Reppen
Publication Date: June 2008
This book brings together contributions from a diverse collection of scholars who explore different ways of combining corpus linguistics and discourse analysis, studying discourse at the prosodic, lexical, and textual levels. Both spoken and written discourse are investigated in a variety of settings, including academia, the workplace, news, and entertainment. Not only does the volume offer a rich sample of English-language discourse from around the world, including international, learner, and non-standard varieties of English, but it also covers a range of topics and methods. This book will be of particular interest to researchers and students specializing in discourse studies, English linguistics, and corpus linguistics.
Corpora and Language Learners
Editor(s): Guy Aston, Silvia Bernardini and Dominic Stewart
Publication Date: November 2004
Corpus-aided language pedagogy is one of the central application areas of corpus methodologies, and a test bed for theories of language and learning. This volume provides an overview of current trends, offering methodological and theoretical position statements along with results from empirical studies. The relationship between corpora and learning is examined from complementary perspectives: the study of learner language, the didactic use of corpus findings, and the interaction between corpora and their users. Reflections on current theory and technology open and close the volume.
With its focus on the learner and the learning setting, Corpora and Language Learners is addressed to corpus linguists with an interest in learner language, applied linguists wishing to expand their understanding of corpora and their pedagogic potential, and language teachers wishing to critically assess the relevance of work in this field.
This volume grew out of selected presentations at the 5th Teaching and Language Corpora conference in Bertinoro, Italy.
Corpora and Language Teaching
Editor(s): Karin Aijmer
Publication Date: January 2009
The articles in this edited volume cover a broad range of areas. They discuss the role and effectiveness of corpora and corpus-linguistic techniques for language teaching but also deal with broader issues such as the relationship between corpora and second language teaching and how the different perspectives of foreign language teachers and applied linguists can be reconciled. A number of concrete examples are given of how authentic corpus material can be used for different learning activities in the classroom. It is also shown how specific learner problems, for example in the area of phraseology, can be studied on the basis of learner corpora and textbook corpora. On the basis of learner corpora of speech and writing, it is further shown that even advanced learners of English are uncertain about stylistic and text type differences.
Corpora and Rhetorically Informed Text Analysis
Editor(s): David West Brown and Danielle Zawodny Wetzel
Publication Date: June 2023
Corpora and Rhetorically Informed Text Analysis explores applications of rhetorically informed approaches to corpus research. Bringing together contributions from scholars in a variety of fields, it takes up questions of how theories and traditions in rhetorical analysis can be integrated with corpus techniques in order to enrich our understanding of language use, variation, and history. The studies included in this volume shed light on areas as diverse as student academic writing, political discourse, and the digital humanities. These studies all make use of a dictionary-based tagger called DocuScope, which recognizes tens of millions of words and phrases and slots them into categories based on their rhetorical functions. While DocuScope provides a through-line that both links the studies’ various analytical procedures and primes their rhetorical insights, the volume is about more than the explanatory power of a single tool. It demonstrates how rhetorically informed approaches can complement more established corpus methodologies, underscoring their combined potential.
Corpora and the Changing Society
Editor(s): Paula Rautionaho, Arja Nurmi and Juhani Klemola
Publication Date: April 2020
This book showcases eleven studies dealing with corpora and the changing society. The theme of the volume reflects the fact that changes in society lead to changes in language and vice versa. Focusing on the English language, be it from Old English to the present, or a shorter time span in the immediate past, the contributors in this volume use a variety of corpus methods to address the two patterns of change. The cross-fertilization of cultural studies and corpus linguistics, we hope, is beneficial for both parties, as corpus linguistics offers a vast array of materials and methods to investigate cultural and societal change, while cultural studies provide the theoretical background on which to build our research. The studies included in the present volume illustrate the potential avenues and the merits of combining changing language and changing societies.
Corpora, Constructions, New Englishes
Author(s): Samantha Laporte
Publication Date: June 2021
This book takes an integrated approach to the fields of Corpus Linguistics, Construction Grammar, and World Englishes through a thorough constructional and corpus-based examination of the patterning of the versatile high-frequency verb make in British English and New Englishes. It contributes to Construction Grammar theory by adopting a verb-based, rather than construction-based, perspective on argument structure. This allows the probing of the interface between verb-independent generalizations and item-specificity from an underexplored angle that offers new insights into the shape of the constructicon. From a variationist perspective, it seeks to (i) identify features of New Englishes and gauge whether these features exhibit traces of conventionalization, and (ii) assess whether the degree of institutionalization of the New Englishes correlates with linguistic behavior, both from a social and cognitive perspective, thereby contributing to the budding effort to integrate the cognitive and social dimensions into the modeling of linguistic variation in World Englishes.
Corpora, Grammar and Discourse
Editor(s): Nicholas Groom, Maggie Charles and Suganthi John
Publication Date: October 2015
Corpus linguistics has had a revolutionary impact on grammar and discourse research. Not only has it opened up entirely new theoretical perspectives and methodological possibilities for both fields, but it has also to a considerable extent erased the boundaries that have traditionally been drawn between them. This book showcases a variety of current corpus-based approaches to the study of grammar and discourse, and makes a case for seeing grammar and discourse as fundamentally inter-related phenomena. The book features contributions from leading experts in cognitive linguistics, construction grammar, critical discourse studies, genre and register analysis, phraseology, language learning and teaching, languages for specific purposes, second language acquisition, sociolinguistics, systemic functional linguistics and text linguistics. An essential reference point for future research, Corpora, Grammar and Discourse has been edited in honour of Susan Hunston, whose own work has consistently pushed at the boundaries of corpus-based research on grammar and discourse for over three decades.
Corpus and Context
Author(s): Svenja Adolphs
Publication Date: April 2008
Corpus and Context explores the relationship between corpus linguistics and pragmatics by discussing possible frameworks for analysing utterance function on the basis of spoken corpora. The book articulates the challenges and opportunities associated with a change of focus in corpus research, from lexical to functional units, from concordance lines to extended stretches of discourse, and from the purely textual to multi-modal analysis of spoken corpus data. Drawing on a number of spoken corpora including the five-million-word Cambridge and Nottingham Corpus of Discourse in English (CANCODE, funded by CUP (c)), the book explores a specific speech act function using different approaches and different levels of analysis. This involves a close analysis of contextual variables in relation to lexico-grammatical and discoursal patterns that emerge from the corpus data, as well as a wider discussion of the role of context in spoken corpus research.
-
-
-
Corpus and Sociolinguistics
Author(s): Bróna Murphy
Publication Date: February 2010
Age is by far the most underdeveloped of the sociolinguistic variables in terms of research literature. To date, research on age has been patchy and has generally focused on the early life-stages such as childhood and adolescence, ignoring, for the most part, healthy adulthood as a stage worthy of scrutiny. This book examines the discourse of adulthood and accounts for sociolinguistic variation, with regard to age and gender, through the exploration of a 90,000-word age- and gender-differentiated spoken corpus of Irish English. The book explores both the distribution and use of a number of high-frequency pragmatic features of spoken discourse that appear as key items in the corpus. Part 1 of the book provides an introduction, a theoretical overview of age as a sociolinguistic variable and a description of how to compile a small spoken corpus for sociolinguistic research. Part 2 consists of five chapters which investigate and explore key features such as hedges, vague category markers, intensifiers, boosters and high-frequency items of taboo language in relation to the variables age and gender. The book is of interest to undergraduates or postgraduates taking formal courses in sociolinguistics, applied linguistics, pragmatics or discourse analysis. It is also of interest to students and researchers interested in using corpus linguistics in sociolinguistic research.
-
-
-
Corpus Approaches to Grammaticalization in English
Editor(s): Hans Lindquist and Christian Mair
Publication Date: June 2004
Grammaticalization is an important concept in general and typological linguistics and a prominent type of explanation in historical linguistics. For historical corpus linguists, grammaticalization theory provides a frame of orientation in their effort to analyze and systematize a fast-accumulating mass of data. Students of grammaticalization have become increasingly aware of the potential of existing corpora and established corpus-linguistic methodology for their work. This book continues and develops the dialogue between the two fields. All the contributions are based on extensive use of various electronic corpora. Relating corpus practices to recent theoretical concerns of grammaticalization studies, they deal with grammaticalization and historical sociolinguistics, lexicalization and grammaticalization, layering, frequency, grammaticalization and dialects, degrammaticalization, and grammaticalization in a contrastive perspective. The papers show that a synthesis of corpus methodology and grammaticalization studies leads to new and interesting insights about the mechanisms of language change and the communicative functions of language.
-
-
-
Corpus Approaches to Social Media
Editor(s): Sofia Rüdiger and Daria Dayter
Publication Date: November 2020
From Twitter to Reddit, Facebook, and WhatsApp – social media is a part of modern everyday life. Studying the language used on social media platforms presents great opportunities as well as challenges to corpus linguists. The contributions in Corpus Approaches to Social Media address technical, ethical, and methodological issues by showcasing in-depth social media studies as conducted by corpus scholars. The chapters are based on a variety of social media platforms and include corpus perspectives on the language of online communities, linguistic variation in short media texts, and the role of images in computer-mediated communication. A particularly strong point of the collection is its detailed accounts of the methodological aspects of working with social media corpora. The volume features research applying traditional corpus linguistic methods to social media data as well as novel and innovative research methods for the analysis of multimodal material and atypical corpus texts.
-
-
-
Corpus Dialectology
Editor(s): Elissa Pustka, Carmen Quijada Van den Berghe and Verena Weiland
Publication Date: August 2023
Corpus Dialectology combines the fields of corpus linguistics and dialectological mapping. It concerns documentation of linguistic variation and mapping of linguistic spaces and boundaries, while ascribing renewed importance to the methodology and the material itself, especially data processing and statistical analysis. This approach considers phenomena that have received little attention to date, such as migration, language contact, mobility and educational level, as well as the differentiation between rural and urban spaces. Transparently described and intersubjectively comprehensible encodings permit the enhancement of dialectometry in the context of Digital Humanities and further development of linguistic theories of variation and change, as well as different levels of structure (phonology, morphosyntax, semantics). This book contains nine chapters on ongoing corpus dialectological research projects. They discuss current issues of data collection, for example the validity of crowdsourced data, explore challenges and possibilities of data analysis and offer theoretical reflections on virtual Romance geolinguistics.
-
-
-
Corpus Interrogation and Grammatical Patterns
Editor(s): Kristin Davidse, Caroline Gentens, Lobke Ghesquière and Lieven Vandelanotte
Publication Date: November 2014
The studies in this volume approach English grammatical patterns in novel ways by interrogating corpora, focusing on patterns in the verb phrase (tense, aspect and modality), the noun phrase (intensification and focus marking), complementation structures and clause combining. Some studies interrogate historical corpora to reconstruct the diachronic development of patterns such as light verb constructions, verb-particle combinations, the be a-verbing progressive and absolute constructions. Other studies analyse synchronic datasets to typify the functions in discourse of, amongst others, tag questions and it-clefts, or to elucidate some long-standing problems in the syntactic analysis of verbal or adjectival complementation patterns, thanks to the empirical detail only corpora can provide. The volume documents the practices that have been developed to guarantee optimal representativeness of corpus data, to formulate definitions of patterns that can be operationalized in extractions, and to build dimensions of variation such as text type and register into rich grammatical descriptions.
-
-
-
Corpus Linguistics and African Englishes
Editor(s): Alexandra U. Esimaje, Ulrike Gut and Bassey E. Antia
Publication Date: February 2019
Corpus linguistics has become one of the most widely used methodologies across the different linguistic subdisciplines; the study of world-wide varieties of English, in particular, uses corpus-based investigations as one of its chief methodologies. This volume comprises descriptions of the many new corpus initiatives both within and outside Africa that aim to compile various corpora of African Englishes. Moreover, it contains cutting-edge corpus-based research on African Englishes and the use of corpora in pedagogic contexts within African institutions. This volume thus serves as a practical introduction to corpus compilation (Part I of the book), corpus-based research (Part II) and the application of corpora in language teaching (Part III), and is intended both for those researchers not yet familiar with corpus linguistics and as a reference work for all international researchers investigating the linguistic properties of African Englishes.
-
-
-
Corpus Linguistics at Work
Author(s): Elena Tognini-Bonelli
Publication Date: April 2001
The book offers a combined discussion of the main theoretical, methodological and application issues related to corpus work. Thus, starting from the definition of what a corpus is and why reading a corpus calls for a different methodology from reading a text, the underlying assumptions behind corpus work are discussed.
The two main approaches to corpus work are discussed as the “corpus-based” and the “corpus-driven” approach and the theoretical positions underlying them explored in detail. The book adopts and exemplifies the parameters of the corpus-driven approach and posits a new unit of linguistic description defined systematically in the light of corpus evidence. The applications where the corpus-driven approach is exemplified are language teaching and contrastive linguistics. Alternating between practical examples and theoretical evaluation, the reader is led step-by-step to a detailed understanding of the issues involved in corpus work and, at the same time, tempted to explore for himself some of the major applications where a corpus-driven methodology can reveal unprecedented insights into linguistic patterning.
-
-
-
The Corpus Linguistics Discourse
Editor(s): Anna Čermáková and Michaela Mahlberg
Publication Date: December 2018
With an ever-growing body of corpus linguistic tools, resources and applications, it becomes increasingly important to reflect critically on the underlying assumptions that corpus linguistics is based on. Focusing on meaning and methods, this book tackles fundamental concepts and approaches that define the discourse of the field. Internationally renowned contributors address topics that range from the history of corpus linguistics to contrastive perspectives between languages, to interpreting patterns in corpora as evidence of both mainstream discourses and individual voices within them. This collection not only adds to our understanding of the fundamentals of corpus linguistics, it also brings innovative meanings to the corpus linguistics discourse. It has been edited in honour of Wolfgang Teubert, who for decades has been a significant voice in this discourse.
-
-
-
Corpus Perspectives on Patterns of Lexis
Editor(s): Hilde Hasselgård, Jarle Ebeling and Signe Oksefjell Ebeling
Publication Date: June 2013
A hallmark of corpus linguistics is the study of patterns of language use. The studies presented in this volume all use corpora to investigate patterns of lexis from various perspectives. The first section, “Sequence and Order”, presents theoretical and practical aspects of the linguist’s task of uncovering the principles that determine such patterns. The next section, “Competing Constructions”, discusses the relationship between lexical patterns with similar meanings in the light of diachronic, regional and register variation. New developments in terms of lexicogrammatical meaning and patterning are dealt with in the section “Emerging Patterns”. The final section, “Correlating patterns and meaning”, discusses ways in which meaning can be studied in corpus data despite the lack of narrowly defined search terms. Though situated at different points on a continuum between lexical and grammatical emphasis, the studies all confirm the inseparability of lexis and grammar.
-
-
-
Corpus Use in Cross-linguistic Research
Editor(s): Marlén Izquierdo and Zuriñe Sanz-Villar
Publication Date: November 2023
Cross-linguistic research is a fruitful field of language inquiry that has benefited enormously from the use of corpora. As sources of linguistic data of various kinds and as tools for language processing, corpora have shaped the development of cross-linguistic research, enabling both language description and practical applications. This volume contains twelve studies that emphasize the usefulness and usability of parallel corpora in accurately exploring the structure and use of seven under-researched languages and language varieties. The first part emphasizes the role of corpus-based descriptive analyses at the lexicogrammatical and discursive levels, as a first step on the way towards concrete applications like translation or language teaching. The second part focuses on the role of parallel-corpus-based language processing techniques and applications that facilitate professional communication. This book will be of interest to scholars in contrastive linguistics, translation studies, discourse analysis, language teaching, and natural language processing.
-
-
-
Corpus, Cognition and Causative Constructions
Author(s): Gaëtanelle Gilquin
Publication Date: March 2010
English causative constructions with cause, get, have and make are often mistakenly presented as (quasi-)synonymous and more or less interchangeable. This book demonstrates the value of corpus linguistics in identifying the syntactic, semantic, lexical and stylistic features that are distinctive for each of these constructions. It also underlines the usefulness of providing corpus studies with a solid theoretical foundation by showing how corpus linguistics can be fruitfully combined with cognitive linguistics, which is used both as a starting point for the analysis (top-down approach) and as a framework within which to interpret the corpus results (bottom-up approach). From a methodological point of view, the study illustrates the complementarity of corpus and elicitation data, and offers tools and methods that could be used to investigate other syntactic structures. Finally, the book also has a pedagogical dimension in that it examines how the research findings can be applied to foreign language teaching.
-
-
-
Corpus-based Analyses of the Problem–Solution Pattern
Author(s): Lynne Flowerdew
Publication Date: February 2008
This book reports research on the Problem-Solution rhetorical pattern, which has to date received very little attention in corpus-based studies. Insights from genre analysis and systemic-functional grammar are also applied to the analysis of the Problem-Solution pattern, thus moving towards a more multi-faceted analysis of corpus data. The pattern is investigated in two specialized corpora of technically-oriented report writing, a professional corpus and a student corpus, using a key word and key-key word analysis. Phraseological analyses of key words in both corpora are presented. Data show that students’ writing lacks a range of lexico-grammatical patternings for expressing the Problem and Solution elements of the pattern. The book concludes with some pedagogic implications and applications of the findings. Suggested concordancing activities are discussed within the context of key issues in the field of data-driven learning.
-
-
-
Corpus-based and Computational Approaches to Discourse Anaphora
Editor(s): Simon Philip Botley and Tony McEnery
Publication Date: June 2000
Discourse anaphora is a challenging linguistic phenomenon that has given rise to research in fields as diverse as linguistics, computational linguistics and cognitive science. Because of the diversity of approaches these fields bring to the anaphora problem, the editors of this volume argue that there needs to be a synthesis, or at least a principled attempt to draw the differing strands of anaphora research together. The selected papers in this volume all contribute to the aim of synthesis and were selected to represent the growing importance of corpus-based and computational approaches to anaphora description, and to developing natural language systems for resolving anaphora in natural language.
-
-
-
Corpus-based Approaches to Register Variation
Editor(s): Elena Seoane and Douglas Biber
Publication Date: December 2021
As the first collective volume to focus exclusively on corpus-based approaches to register variation, this book provides an exhaustive account of the range and depth of possibilities that the domain of register variation in English has to offer. It illustrates register variation analysis in different theoretical frameworks, such as Probabilistic Grammar, Systemic Functional Linguistics, and Information Theory, and proposes a new framework within the Text Linguistic Approach: the continuous-situational analytical framework. Several of the contributions apply Multi-Dimensional Analysis to corpus data in order to unveil register (dis)similarities, while others rely on logistic regression models and periodization techniques based on Kullback-Leibler divergence. The volume includes both inter-register and intra-register variation analysis of a wide spectrum of varieties, speakers and periods: British and American English, learner varieties, L2 varieties, and also contains diachronic studies covering early and late Modern English. This broad scope should be a source of inspiration for anyone interested in historical and ongoing register variation in a vast range of varieties of English worldwide.
-
-
-
Corpus-based Research in Applied Linguistics
Editor(s): Viviana Cortes and Eniko Csomay
Publication Date: January 2015
This volume comprises nine contributions that were written by up-and-coming corpus-based researchers with varied areas of expertise, who were all disciples of Douglas Biber sometime in the past two decades. These papers cover a wide variety of linguistic analyses and describe the principles of the Flagstaff school: a careful procedure for language corpora collection with special consideration for corpus size, representativeness, sampling and systematic analysis; the use of computer programming abilities that allow the posing of corpus-based research questions never asked before; and a strong emphasis on the combination of quantitative methods based on sound and innovative statistical procedures complemented with comprehensive qualitative functional analyses of the language. This volume has been edited in honor of Douglas Biber, a pioneer of the American school of corpus-based research.
-
-
-
Corpus-based Research on Variation in English Legal Discourse
Editor(s): Teresa Fanego and Paula Rodríguez-Puente
Publication Date: February 2019
This volume provides a comprehensive overview of the research carried out over the past thirty years in the vast field of legal discourse. The focus is on how such research has been influenced and shaped by developments in corpus linguistics and register analysis, and by the emergence from the mid-1990s of historical pragmatics as a branch of pragmatics concerned with the scrutiny of historical texts in their context of writing. The five chapters in Part I (together with the introductory chapter) offer a wide spectrum of the latest approaches to the synchronic analysis of cross-genre and cross-linguistic variation in legal discourse. Part II addresses diachronic variation, illustrating how a diversity of methods, such as multi-dimensional analysis, move analysis, collocation analysis, and Darwinian models of language evolution, can uncover new understandings of diachronic linguistic phenomena.
-
-
-
Corpus-based Studies of Lesser-described Languages
Editor(s): Amina Mettouchi, Martine Vanhove and Dominique Caubet
Publication Date: May 2015
This volume presents new findings based on the analysis of spoken corpora in thirteen different Afro-Asiatic languages – a unique endeavor in the domain of lesser-described languages. It will be of interest to corpus linguists, general linguists, typologists, and linguists specializing in Afro-Asiatic languages. In addition to the rarity of corpus studies based on endangered and lesser-described languages, the volume is remarkable due to its focus on the role of prosody in interaction with several other phenomena, including code-switching and borrowing. Phonology, syntax, and information structure are explored, and the issue of the elaboration of strategies for the typological comparison of corpora is addressed in several papers. The volume also contains a presentation of software development conducted within the scope of the CorpAfroAs project and based upon the widely used ELAN. The sound-indexed and morphosyntactically annotated corpora, with their OLAC metadata and several other deliverables, can be accessed and searched at http://dx.doi.org/10.1075/scl.68.website.
-
-
-
A Corpus-driven Study of Discourse Intonation
Author(s): Winnie Cheng, Chris Greaves and Martin Warren
Publication Date: December 2008
The book is the first to apply David Brazil’s Discourse Intonation systems (prominence, tone, key and termination) to the study of a corpus of authentic, naturally-occurring spoken discourses. The Hong Kong Corpus of Spoken English (prosodic) is made up of approximately one million words consisting of four sub-corpora of equal size, namely academic, conversation, business and public. The participants are all adults and typically have either Cantonese or English as their first language. The four Discourse Intonation systems are described in terms of how each system works and how it is manifested in the corpus, both across the sub-corpora and also across speakers in the corpus. The book is accompanied by a CD containing the prosodically transcribed corpus together with iConc, the software designed and written specifically to interrogate the HKCSE (prosodic). The issues raised and discussed are all of importance in Conversation Analysis, Corpus Linguistics, Discourse Analysis, Discourse Intonation, Pragmatics, and Intercultural Communication.
-
-
-
Corpus-Informed Research and Learning in ESP
Editor(s): Alex Boulton, Shirley Carter-Thomas and Elizabeth Rowley-Jolivet
Publication Date: May 2012
These specially-commissioned studies cover corpus-informed approaches to researching, teaching and learning English for Specific Purposes (ESP). The corpora used range from very large published corpora to small tailor-made collections of written and spoken text, as well as parallel and contrastive corpora, in both the hard and softer sciences. Designed to tackle the problems faced by a variety of first- and second-language ESP users (specialised translators, undergraduates, junior and experienced researchers, and language trainers), the breadth of approaches enables treatment of issues central to ESP and corpus research, from corpus compilation and analysis to new applications and data-driven learning. The first full-length book on applied corpus use in France, Corpus-Informed Research and Learning in ESP will be of interest not only to those working in the French context, but to a wide variety of language professionals – teachers, researchers or course designers – in many countries looking at ESP from different linguistic, cultural and educational perspectives.
-
-
-
Cross-linguistic Register Variation
Editor(s): Sylvi Rørvik and Marlén Izquierdo
Publication Date: February 2026
A current trend in contrastive corpus linguistics is to take register variation as a point of departure for identifying similarities and differences across languages. This volume looks back at central previous contributions in this area, and adds to our store of knowledge in the form of nine studies comparing English to five other languages in a wide variety of registers representing written, spoken, and written-to-be-spoken modes of communication. The volume starts with a semi-systematic review of previous research on corpus-based register variation comparing English with the other languages represented in the volume’s studies, which are Dutch, French, German, Norwegian, and Spanish. In the subsequent nine chapters, a variety of topics are explored, ranging from verb and noun phrases to adverbials and other lexico-grammatical constructions. This book will be of interest to scholars, experts, and novices in the fields of contrastive corpus linguistics, register studies, and translation studies.
-
-
-
Crossing Boundaries through Corpora
Editor(s): Sarah Buschfeld, Patricia Ronan, Theresa Neumaier, Andreas Weilinghoff and Lisa Westermayer
Publication Date: October 2024
This volume illustrates new trends in corpus linguistics and shows how corpus approaches can be used to investigate new datasets and emerging areas in linguistics and related fields. It addresses innovative research questions, for example how prosodic analyses can increase the accuracy of syntactic segmentation, how tolerant English language teachers are about language variation, or how natural language can be translated into corpus query language. The thematic scope encompasses four types of ‘boundary crossings’. These include the incorporation of innovative scientific methods, specifically new statistical techniques, acoustic analysis and stylistic investigations. Additionally, temporal boundaries are crossed through the use of new methods and corpora to study diachronic data. New methodologies are also explored through the analysis of prosody, variety-specific approaches, and teacher attitudes. Finally, corpus users can cross boundaries by employing a more user-friendly corpus query language.
-
-
-
Decoding Movie Language through Multi-Dimensional Analysis and the Grammar of Graphics
Author(s): Pierfranca Forchini
Publication Date: August 2025
This book offers a comprehensive and refined account of movie discourse through the application of Multi-Dimensional Analysis (MDA) to the American Movie Corpus, a collection of authentic, verified movie dialog transcriptions. Expanding on previous MDA-based research, it broadens both the scope of data and the methodological framework by integrating the Grammar of Graphics to facilitate the interpretation of linguistic findings. The study addresses the longstanding debate on the authenticity of scripted dialog, demonstrating the textual and linguistic proximity between movie language and spontaneous conversation. It includes genre-based and diachronic analyses, offering a rigorous, data-driven perspective on movie language as both a linguistic resource and a tool for teaching spoken grammar. Bridging corpus linguistics, applied linguistics, and media studies, the book provides valuable insights for scholars, educators, and learners interested in spoken language, ELT, and telecinematic discourse, while contributing a novel, visualized approach to empirical language analysis.
-
-
-
Defining Language
Author(s): Geoff Barnbrook
Publication Date: October 2002
Definition is a basic activity of language, of particular importance to linguists because of its use of language to describe itself. Beyond this inherent significance as a crucial element of language study, definitions also provide a rich potential source of the information needed for Natural Language Processing systems. This book describes an investigation of the subset of general language used in definition sentences and the development of a taxonomy of definition types, a grammar of definition sentences and parsing software which can extract their functional components. The work is based on definition sentences used in one of the dictionaries from the Cobuild range, and the book includes a brief history of the development of monolingual English dictionaries, an assessment of the concepts of sublanguages and local grammars and a full exploration of the results of the analysis and of the present and future applications of the taxonomy, grammar and parser.