Exploiting a Large Spoken Corpus: An End-user's Way to the BNC
Source: International Journal of Corpus Linguistics, Volume 4, Issue 1, Jan 1999, pp. 29-52
Abstract
The British National Corpus (BNC) contains a spoken component of about 10 million words, consisting of spoken language of various kinds produced by different speakers in a variety of situations. Starting from an end-user's perspective, this paper surveys the potential of this resource and some possible problems one might encounter if not fully versed in the details of the compilation and coding plans. Among the issues touched upon are questions relating to the composition of the component, the transcription principles employed, and points relating to the nature and coverage of the mark-up. By way of illustration, examples are drawn from a case study of the variant forms gonna and going to.
© 1999 John Benjamins Publishing Company