Volume 6, Issue 2
  • ISSN 1384-6655
  • E-ISSN: 1569-9811


An adult language corpus of spoken Hong Kong Cantonese (HKCAC) has recently been developed. It consists of spontaneous speech recorded from radio phone-in programs and forums in Hong Kong. The database represents the speech of sixty-nine speakers in addition to the program hosts, and contains approximately 170,000 characters. HKCAC is expected to be of great value to linguists interested in studying Cantonese, and to speech therapists and educators who work with the Cantonese-speaking population. A search engine with a user-friendly interface has also been developed using FileMaker Pro 4.0 (Chinese version). In addition to providing basic frequency information and displaying search results in KWAL (Key Word And Line) format, the search engine allows users to search for the various phonetic realizations of a particular character, or for the set of characters associated with a particular syllable. The paper describes the content and structure of the corpus, along with the overall architecture and technical aspects of the search engine. Search procedures are illustrated with examples. The paper ends with a discussion of the future development of HKCAC.
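The three search modes named in the abstract (KWAL display, phonetic realizations of a character, and characters associated with a syllable) can be illustrated with a minimal Python sketch. This is not the FileMaker Pro implementation described in the paper. The data layout, the function names, and the three sample lines are hypothetical and are not taken from HKCAC. The sketch assumes each transcribed line is stored as a sequence of (character, romanized syllable) pairs.

```python
from collections import Counter

# Hypothetical toy data, NOT drawn from HKCAC: each line is a list of
# (character, romanized syllable) pairs.
CORPUS = [
    [("你", "nei5"), ("好", "hou2")],
    [("你", "lei5"), ("食", "sik6"), ("飯", "faan6")],
    [("好", "hou2"), ("食", "sik6")],
]


def kwal(corpus, char):
    """Key Word And Line: return (line number, full line) for each line containing char."""
    hits = []
    for number, line in enumerate(corpus, start=1):
        if any(c == char for c, _ in line):
            hits.append((number, "".join(c for c, _ in line)))
    return hits


def realizations(corpus, char):
    """Count each romanized syllable that char is transcribed with."""
    return Counter(s for line in corpus for c, s in line if c == char)


def characters_for_syllable(corpus, syllable):
    """Return the sorted set of characters transcribed with the given syllable."""
    return sorted({c for line in corpus for c, s in line if s == syllable})


if __name__ == "__main__":
    print(kwal(CORPUS, "你"))
    print(dict(realizations(CORPUS, "你")))
    print(characters_for_syllable(CORPUS, "sik6"))
```

Storing the transcription alongside each character is one simple way to support both lookup directions (character to syllables, syllable to characters) from the same records; the actual record structure used in HKCAC may differ.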



  • Article Type: Research Article
Keyword(s): computerized corpus; Hong Kong Cantonese; search engine; spoken corpus