
In this paper, we provide an overview of the new GloWbE Corpus, the Corpus of Global Web-based English. GloWbE is based on 1.9 billion words in 1.8 million web pages from 20 different English-speaking countries. Approximately 60 percent of the corpus comes from informal blogs, and the rest from a wide range of other genres and text types. Because of its large size, architecture, and interface, the corpus can be used to examine many types of variation among dialects that might not be possible to study with other corpora, including variation in lexis, morphology, (medium- and low-frequency) syntactic constructions, and meaning, as well as discourse and its relationship to culture.
This focus article was commented upon by Christian Mair, Joybrato Mukherjee, Gerald Nelson, and Pam Peters, with a response by Mark Davies and Robert Fuchs.