Automatic suprasegmental parameter extraction in learner corpora
This chapter attempts to automatically compute suprasegmental (and, in particular, rhythmic) parameters that could be used to distinguish a group of French learners of English from a group of native speakers. As a preliminary step, an automatic segmentation algorithm is benchmarked against manual segmentation. This assessment leads us to reject the classic duration-based rhythm metrics and to adopt alternative measurements involving pitch and intensity. Finally, we use an automatic classifier to check how reliably our metrics separate learner speech from native speech.