International Journal of Corpus Linguistics - Volume 6, Issue 2, 2001
-
Designing CoSIH: The Corpus of Spoken Israeli Hebrew
Author(s): Shlomo Izre'el, Benjamin Hary and Giora Rahav
pp.: 171–197 (27)
This paper describes the initial design of the Corpus of Spoken Israeli Hebrew (CoSIH). CoSIH will attempt to include a representation of most varieties of spoken Hebrew as it is used in Israel today. CoSIH is designed to consist of two complementary corpora: a main corpus and a supplementary corpus. The main corpus, which will comprise about 90% of the entire collection, will be sampled statistically. For analytical purposes it will use a conceptual tool in the form of a multidimensional matrix combining demographic and contextual tiers. The combined demographic and contextual design will be capable of showing the distribution of speech types in various subgroups of the population. The supplementary corpus will include about 10% of the collected data, and will add to the statistically-sampled corpus some targeted demographically sampled texts and a contextually designed collection. This design is culturally dependent to suit the special structure of the Israeli Hebrew speech community and thus includes both native and non-native speakers of Hebrew. Nonetheless, the principles governing this design are such that they would service study of many other speech communities, to the extent that the design itself may be employed for other corpora with only slight modifications.
-
Lexical Constellations: What Collocates Fail to Tell
Author(s): Pascual Cantos-Gomez and Aquilino Sánchez
pp.: 199–228 (30)
The aim of this paper is to shed new light on collocational analysis, reviewing the main stream of research on this issue and trying to overcome some of its intrinsic problems, such as determining the optimal span and partially explaining the reason for undesired collocates (statistically significant collocates, though lexically and semantically not related to the node word). The idea is not just to calculate significant collocates of a chosen node word or to elucidate which is the best statistical procedure to achieve this goal. The aim is to investigate the way words socialise with other words, forming complex network-like structures or units: lexical constellations. This behaviour cannot be explained solely on a grammatical and/or semantic basis or even based on the present state of the art in collocation research.
-
Contrastive and Comparable Corpora: Quantitative Aspects
Author(s): Anatole Shaikevich
pp.: 229–255 (27)
This paper draws attention to the complexity of problems arising in statistical linguistics when it must compare various corpora. Those problems are discussed from the point of view of distributional statistical analysis of texts; that is, a set of formal procedures with a minimum of preconceived linguistic knowledge. The terminological distinction between contrastive and comparable corpora is introduced.
-
The Functions of Actually in a Corpus of Intercultural Conversations
Author(s): Winnie Cheng and Martin Warren
pp.: 257–280 (24)
Using a corpus of naturally occurring conversations between native and non-native speakers of English in Hong Kong, we examine the use of actually in intercultural conversations. The frequencies with which the two groups of speakers use actually and the functions it performs are compared and contrasted. Our findings suggest that Hong Kong Chinese speakers of English use actually far more frequently than native speakers of English. The patterns of usage are remarkably similar in certain respects but there are differences in use and in the position actually occupies in utterances which in turn can affect the way that it functions. Explanations are offered for the differences in usage between the two groups of speakers.
-
Recent Progress in Corpus Linguistics in China
Author(s): Jianxin Wang
pp.: 281–304 (24)
This paper discusses some of the new developments in corpus linguistics in China. In the area of Chinese corpus compilation it presents large-scale text databases, representative corpora, annotated corpora, lexical databases for information processing, phonological, dialectal, spoken and other specialized corpora. In connection with the analysis and annotation of Chinese corpora, the characteristics of the Chinese language, word segmentation, tagging, parsing, and some corpus analytical systems are described. Concerning English corpus studies, some corpora of English as a Foreign Language and corpus-based research are depicted. On this basis tentative conclusions are drawn.
-
HKCAC: The Hong Kong Cantonese Adult Language Corpus
Author(s): Man-Tak Leung and Sam-Po Law
pp.: 305–325 (21)
An adult language corpus of spoken Hong Kong Cantonese (HKCAC) has recently been developed consisting of spontaneous speech recorded from phone-in programs and forums on the radio in Hong Kong. The database represents the speech of a total of sixty-nine speakers in addition to the program hosts, and has approximately 170,000 characters. It is believed that HKCAC will be of great value to linguists who are interested in studying Cantonese, and speech therapists and educators who work with the Cantonese speaking population. A search engine with a user-friendly interface has also been developed by using FileMaker Pro 4.0 (Chinese version). Apart from the basic frequency information and the display of search results in KWAL (Key Word And Line) format, the search engine also allows users to search for various phonetic realizations of a particular character or the set of characters associated with a particular syllable. The content and structure of the corpus, and the overall architecture as well as the technical aspects of the search engine are described. Search procedures are illustrated with examples. The paper ends with a discussion of the future development of HKCAC.