The design of a corpus of Contemporary Arabic
- Source: International Journal of Corpus Linguistics, Volume 11, Issue 2, Jan 2006, p. 135 - 171
Abstract
Corpora are an important resource for both teaching and research. Arabic lacks sufficient resources in this field, so a research project has been designed to compile a corpus that represents the state of the Arabic language at the present time and meets the needs of end-users. This report presents the results of a survey of the needs of teachers of Arabic as a foreign language (TAFL) and language engineers. The survey shows that a wide range of text types should be included in the corpus. Overall, our survey confirms our view that existing corpora are too narrowly limited in source type and genre, and that there is a need for a freely accessible corpus of contemporary Arabic covering a broad range of text types. We have collected and published an initial version of the Corpus of Contemporary Arabic (CCA) to address these design issues. The CCA is freely downloadable via WWW from http://www.comp.leeds.ac.uk/arabic.