Volume 4, Issue 1
  • ISSN 2215-1931
  • E-ISSN: 2215-194X


Language learning is a multimodal endeavor; to improve their pronunciation in a new language, learners access not only auditory information about speech sounds and patterns, but also visual information about articulatory movements and processes. With the development of new technologies in computer-assisted pronunciation training (CAPT) come new possibilities for delivering feedback in both auditory and visual modalities. The present paper surveys the literature on computer-assisted visual articulation feedback, including direct feedback that provides visual models of articulation and indirect feedback that uses visualized acoustic information to inform articulation instruction. Our focus is explicitly on segmental features rather than suprasegmental ones, with visual feedback conceived of as providing visualizations of articulatory configurations, movements, and processes. In addition to discussing types of visual articulation feedback, we consider the criteria for effective delivery of feedback and methods of evaluation.






  • Article Type: Research Article
Keyword(s): articulation; CAPT; multimodality; segmental features; visual feedback