Volume 6, Issue 1
  • ISSN: 1384-6655
  • E-ISSN: 1569-9811

Abstract

In this article, we present and evaluate a method for training a statistical part-of-speech tagger on data from written language and then adapting it to the requirements of tagging a corpus of transcribed spoken language, in our case spoken Swedish. This is currently a significant problem for many research groups working with spoken language, since the availability of tagged training data from spoken language is still very limited for most languages. The overall accuracy of the tagger developed for spoken Swedish is quite respectable, varying from 95% to 97% depending on the tagset used. In conclusion, we argue that the method presented here gives good tagging accuracy with relatively little effort.

/content/journals/10.1075/ijcl.6.1.03niv
2001-01-01
  • Article Type: Research Article
Keyword(s): spoken language corpora; statistical part-of-speech tagging