Training of English prosody with acoustically modified voices

Abstract

Prosodic aspects of speech are crucial for the comprehensibility of L2 speakers, but prosody is rarely targeted in English language lessons. This paper describes an innovative training of English phrasal prosody using participants’ own speech as models in a modified listen-and-repeat paradigm, with their melodic and rhythmic patterns manipulated by means of PSOLA. The two-hour training was delivered individually to twelve intermediate native speakers of Czech. The comparison of a baseline recording of a read text before the training and two texts read six weeks after the training shows that using one’s own PSOLA-modified voice for prosody training is beneficial: the participants were perceived as sounding significantly more competent in the after-training recordings, their phrasing corresponded more to text-based predictions, and their melodic variability was significantly greater. The contribution of targeted prosody modifications to the teaching of L2 pronunciation is discussed.

Available under the CC BY 4.0 license.

/content/journals/10.1075/jslp.24041.ska
2025-02-03
2025-02-11

Article Type: Research Article
Keywords: intonation; English as a foreign language; PSOLA manipulation; visualization; prosody