International Journal of Corpus Linguistics - Volume 11, Issue 2, 2006
- The design of a corpus of Contemporary Arabic
  Author(s): Latifa Al-Sulaiti and Eric Steven Atwell
  pp.: 135–171 (37)
  Corpora are an important resource for both teaching and research. Arabic lacks sufficient resources in this field, so a research project has been designed to compile a corpus that represents the state of the Arabic language at the present time and the needs of end-users. This report presents the results of a survey of the needs of teachers of Arabic as a foreign language (TAFL) and language engineers. The survey shows that a wide range of text types should be included in the corpus. Overall, our survey confirms our view that existing corpora are too narrowly limited in source type and genre, and that there is a need for a freely accessible corpus of contemporary Arabic covering a broad range of text types. We have collected and published an initial version of the Corpus of Contemporary Arabic (CCA) to address these design issues. The CCA is freely downloadable via WWW from http://www.comp.leeds.ac.uk/arabic.
- Evolution and present situation of corpus research in China
  Author(s): Zhiwei Feng
  pp.: 173–207 (35)
  In this paper, the author introduces in detail the development and present situation of corpus linguistics in China: earlier corpora, large-scale authentic text corpora, national corpora, speech corpora, bilingual corpora and corpora of minority languages in China. The various processing techniques for corpora are also introduced: automatic word segmentation of Chinese text, automatic PoS tagging, automatic tagging of phrase structure and automatic alignment of bilingual corpora. This paper gives a bird’s-eye view of corpus linguistics in China. Finally, the author discusses several problems in present corpus research: standardization of corpus specifications, sharing of language resources, knowledge properties, etc.
- Discovering and organizing noun-verb collocations in specialized corpora using inductive logic programming
  Author(s): Vincent Claveau and Marie-Claude L'Homme
  pp.: 209–243 (35)
  This article presents a method for discovering and organizing noun-verb (N-V) combinations found in a French corpus on computing. Our aim is to find N-V combinations in which verbs convey a “realization meaning” as defined in the framework of lexical functions (Mel’čuk 1996, 1998). Our approach, chiefly corpus-based, uses a machine learning technique, namely Inductive Logic Programming (ILP). The whole acquisition process is divided into three steps: (1) isolating contexts in which specific N-V pairs occur; (2) inferring linguistically motivated rules that reflect the behaviour of realization N-V pairs; (3) projecting these rules on corpora to find other valid N-V pairs. This technique is evaluated in terms of the relevance of the rules inferred and in terms of the quality (recall and precision) of the results. The results show that our approach is able to find these very specific semantic relationships (the realization N-V pairs) with very good success rates.