Volume 4, Issue 2
  • ISSN: 2215-1931
  • E-ISSN: 2215-194X



Perceivers’ attention is entrained to the rhythm of a speaker’s gestural and acoustic beats. When different rhythms (polyrhythms) occur simultaneously across the visual and auditory modalities of speech, attention may be heightened, enhancing memorability of the sequence. This study had three stages. Stage 1 analyzed video recordings of native English-speaking instructors, focusing on frame-by-frame analysis of polyrhythmic sequences using time-aligned annotations from Praat and Anvil (a video annotation tool). Stage 2 explored the perceivers’ perspective on the sequences’ discourse role. Stage 3 analyzed the gestures of 10 international teaching assistants (ITAs) and implemented a multistep technology-assisted program to enhance their verbal and nonverbal communication skills. Findings demonstrated (a) a dynamic temporal gesture-speech relationship involving perturbations of beat intervals surrounding pitch-accented vowels, (b) the sequences’ important role as highlighters of information, and (c) improvement in ITAs’ confidence, teaching effectiveness, and ability to communicate important points. Findings support the joint production of gesture and prosodically prominent features.




