Volume 68, Issue 5
  • ISSN 0521-9744
  • E-ISSN: 1569-9668



A neutral delivery has often been considered the norm in audio description (AD), but it is unclear what a ‘neutral voice’ means. This article begins with a discussion of neutrality in prosody and a contextualization of AD voicing. It then presents an acoustic analysis of a corpus of audio descriptions in Catalan, English, and Spanish. A perception test is designed on the basis of this analysis, and its results are discussed here. The test involves participants with sight loss (31 in Spanish, 35 in Catalan, 40 in English) and without sight loss (29 in Spanish, 46 in Catalan, 31 in English), who are asked to define what a neutral voice is for them and to select the male and female voices that they consider most neutral. The qualitative analysis of the replies, together with the participants' selections of male and female voices across the three languages, sheds some light on how neutrality (or non-neutrality) could be defined. The study does not aim to determine what acoustic features voices should have in audio description. Instead, it tries to better understand what a neutral voice is, given that the expression has traditionally been frequent in research on audio description delivery.




