Volume 21, Issue 2-3
  • ISSN 1568-1475
  • E-ISSN 1569-9773

Abstract

This study presents an automatic tool that traces smile intensities along a video record of conversational face-to-face interactions. The output is a sequence of adjusted time intervals labeled according to the scale of Gironzetti, Attardo, and Pickering (2016), a five-level scale ranging from neutral facial expression to laughing smile. The statistical model underlying this tool, detailed in this study, is trained on a manually annotated corpus of conversations featuring spontaneous facial expressions. The tool can be used to advantage for annotating smiles in interaction, and the results are twofold. First, the evaluation reveals an observed agreement of 68% between manual and automatic annotations. Second, manually correcting the labels and interval boundaries of the automatic output takes a tenth of the time spent manually annotating smile intensities without pretreatment. Our annotation engine uses the state-of-the-art toolbox OpenFace to track the face and to measure the intensities of the facial Action Units of interest throughout the video. The documentation and scripts of our tool, the SMAD software, are available for download from the HMAD open source project page at https://github.com/srauzy/HMAD (last access 31 July 2023).
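For readers who wish to experiment with a comparable pipeline, the sketch below illustrates the general idea under stated assumptions: OpenFace writes a per-frame CSV containing an AU12_r column (lip corner puller intensity, on a 0-5 scale), and a small hidden Markov model with five states, one per smile intensity level, is decoded with the Viterbi algorithm (Viterbi, 1967; Forney, 1973; Rabiner, 1989) into a frame-by-frame label sequence. The transition and emission parameters here are invented placeholders for illustration; they are not the trained SMAD model, which is estimated from the manually annotated corpus described in the study.

    # Illustrative sketch only: decode 5-level smile intensities from
    # OpenFace AU12 intensities with a toy HMM. All parameters below are
    # invented placeholders, not the trained SMAD model.
    import numpy as np

    N_STATES = 5  # levels 0 (neutral) .. 4 (laughing smile)

    # Sticky transition matrix: smiles persist across consecutive frames.
    TRANS = np.full((N_STATES, N_STATES), 0.02) + np.eye(N_STATES) * 0.90
    LOG_TRANS = np.log(TRANS)  # rows sum to 1.0 (0.92 on the diagonal)

    # Hypothetical Gaussian emission parameters over AU12 intensity (0-5).
    MEANS = np.array([0.2, 1.0, 2.0, 3.2, 4.2])
    SD = 0.6

    def log_emission(au12):
        """Per-state log-likelihood of one AU12 intensity observation."""
        return -0.5 * ((au12 - MEANS) / SD) ** 2 - np.log(SD * np.sqrt(2.0 * np.pi))

    def viterbi(observations):
        """Most likely smile-level sequence for a series of AU12 intensities."""
        T = len(observations)
        delta = np.zeros((T, N_STATES))           # best log-probability per state
        psi = np.zeros((T, N_STATES), dtype=int)  # backpointers
        delta[0] = np.log(1.0 / N_STATES) + log_emission(observations[0])
        for t in range(1, T):
            scores = delta[t - 1][:, None] + LOG_TRANS  # scores[i, j]: i -> j
            psi[t] = scores.argmax(axis=0)
            delta[t] = scores.max(axis=0) + log_emission(observations[t])
        path = np.zeros(T, dtype=int)
        path[-1] = delta[-1].argmax()
        for t in range(T - 2, -1, -1):
            path[t] = psi[t + 1, path[t + 1]]
        return path  # one level (0-4) per video frame

    # Example: a few values from the "AU12_r" column of an OpenFace CSV.
    au12 = np.array([0.1, 0.3, 1.2, 2.5, 3.8, 4.1, 2.0, 0.4])
    print(viterbi(au12))  # e.g. [0 0 1 2 3 3 2 0]

Runs of identical decoded levels can then be merged into labeled time intervals for manual correction in an annotation tool such as ELAN, which matches the output format described above.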

References

  1. Alibali, M. W., Kita, S., & Young, A. J.
    (2000) Gesture and the process of speech production: We think, therefore we gesture. Language and Cognitive Processes, 15(6), 593–613. https://doi.org/10.1080/016909600750040571
  2. Amoyal, M., & Priego-Valverde, B.
    (2019) Smiling for negotiating topic transitions in French conversation. Proceedings of Gesture and Speech in Interaction (GESPIN 2019). Paderborn, Germany, September 2019.
  3. An, L., Yang, S., & Bhanu, B.
    (2015) Efficient smile detection by extreme learning machine. Neurocomputing, 149(Part A), 354–363. https://doi.org/10.1016/j.neucom.2014.04.072
  4. Argyle, M.
    (1975) Bodily communication. London: Methuen.
  5. Artstein, R., & Poesio, M.
    (2008) Inter-coder agreement for computational linguistics. Computational Linguistics, 34(4), 555–596. https://doi.org/10.1162/coli.07-034-R2
  6. Baltrušaitis, T., Mahmoud, M., & Robinson, P.
    (2015) Cross-dataset learning and person specific normalisation for automatic action unit detection. Proceedings of the 11th IEEE International Conference on Automatic Face and Gesture Recognition (pp. 1–6). Ljubljana, Slovenia, May 2015. https://doi.org/10.1109/FG.2015.7284869
  7. Baltrušaitis, T., Robinson, P., & Morency, L.-P.
    (2012) 3D constrained local model for rigid and non-rigid facial tracking. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2012) (pp. 2610–2617). Providence, RI, USA, June 2012. https://doi.org/10.1109/CVPR.2012.6247980
  8. (2013) Constrained local neural fields for robust facial landmark detection in the wild. Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops (pp. 354–361). Portland, OR, USA, June 2013. https://doi.org/10.1109/ICCVW.2013.54
  9. (2016) OpenFace: An open source facial behavior analysis toolkit. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (pp. 1–10). Lake Placid, NY, USA, 2016. https://doi.org/10.1109/WACV.2016.7477553
  10. Baltrušaitis, T., Zadeh, A., Lim, Y. C., & Morency, L.-P.
    (2018) OpenFace 2.0: Facial behavior analysis toolkit. In Proceedings of the 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018) (pp. 59–66). Xi’an, China, 2018. https://doi.org/10.1109/FG.2018.00019
  11. Barrier, G.
    (2013) La communication non verbale: Comprendre les gestes: Perception et signification. Issy-les-Moulineaux: ESF éditeur.
  12. Bartlett, M. S., Littlewort, G. C., Braathen, B., Sejnowski, T. J., & Movellan, J. R.
    (2003) A prototype for automatic recognition of spontaneous facial actions. Advances in Neural Information Processing Systems, 15, 1271–1278.
  13. Bartlett, M. S., Littlewort, G. C., Franck, M. G., Lainscsek, C., Fasel, I. R., & Movellan, J. R.
    (2006) Automatic recognition of facial actions in spontaneous expressions. Journal of Multimedia, 1(6), 22–35. https://doi.org/10.4304/jmm.1.6.22-35
  14. Bateson, G., Winkin, Y., Bansard, D., Cardoen, A., & Birdwhistell, R.
    (1981) La nouvelle communication. Paris: Ed. du Seuil.
  15. Bavelas, J. B., & Gerwing, J.
    (2007) Conversational hand gestures and facial displays in face-to-face dialogue. In K. Fiedler (Ed.), Social communication (pp. 283–308). Psychology Press.
  16. Brugman, H., & Russel, A.
    (2004) Annotating multi-media/multi-modal resources with ELAN. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 2065–2068). Paris: European Language Resources Association.
  17. Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., & Sheikh, Y. A.
    (2019) OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1), 172–186. https://doi.org/10.1109/TPAMI.2019.2929257
  18. Carletta, J.
    (1996) Assessing agreement on classification tasks: The kappa statistic. Computational Linguistics, 22(2), 249–254. Retrieved from dl.acm.org/citation.cfm?id=230386.230390 (last access 1 August 2023).
  19. Chen, J., Ou, Q., Chi, Z., & Fu, H.
    (2017) Smile detection in the wild with deep convolutional neural networks. Machine Vision and Applications, 28(1), 173–183. https://doi.org/10.1007/s00138-016-0817-z
  20. Cohen, J.
    (1960) A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104
  21. Cohn, J. F., & De la Torre, F.
    (2014) Automated face analysis for affective computing. In R. Calvo, S. D’Mello, J. Gratch, & A. Kappas (Eds.), The Oxford handbook of affective computing (pp. 131–150). Oxford: Oxford University Press.
  22. Dempster, A. P., Laird, N. M., & Rubin, D. B.
    (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1), 1–38.
  23. Dhall, A., Goecke, R., Gedeon, T., & Sebe, N.
    (2016) Emotion recognition in the wild. Journal on Multimodal User Interfaces, 10(2), 95–97. https://doi.org/10.1007/s12193-016-0213-z
  24. Ekman, P., Friesen, W., & Hager, J.
    (2002) Facial action coding system: Research nexus. Salt Lake City, UT: Network Research Information.
  25. Ekman, P.
    (1984) Expression and the nature of emotion. In K. Scherer & P. Ekman (Eds.), Approaches to emotion (pp. 319–344). Hillsdale, NJ: Lawrence Erlbaum.
  26. Ekman, P., Davidson, R. J., & Friesen, W. V.
    (1990) The Duchenne smile: Emotional expression and brain physiology: II. Journal of Personality and Social Psychology, 58(2), 342–353. https://doi.org/10.1037/0022-3514.58.2.342
  27. Ekman, P., & Friesen, W.
    (1975) Unmasking the face: A guide to recognizing emotions from facial clues. Englewood Cliffs, NJ: Prentice-Hall.
  28. Ekman, P., & Friesen, W. V.
    (1978) Facial action coding system: Manual. Palo Alto, CA: Consulting Psychologists Press.
  29. El Haddad, K., Chakravarthula, S. N., & Kennedy, J.
    (2019) Smile and laugh dynamics in naturalistic dyadic interactions: Intensity levels, sequences and roles. Proceedings of the ACM International Conference on Multimodal Interaction (pp. 259–263). Suzhou, Jiangsu, China, October 2019. https://doi.org/10.1145/3340555.3353764
  30. Fleiss, J. L.
    (1971) Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378–382. https://doi.org/10.1037/h0031619
  31. Forney, G. D.
    (1973) The Viterbi algorithm. Proceedings of the IEEE, 61(3), 268–278. https://doi.org/10.1109/PROC.1973.9030
  32. Freire-Obregón, D., & Castrillón-Santana, M.
    (2015) An evolutive approach for smile recognition in video sequences. International Journal of Pattern Recognition and Artificial Intelligence, 29(1), 17 pages. https://doi.org/10.1142/S0218001415500068
  33. Girard, J. M., Cohn, J. F., & De la Torre, F.
    (2015) Estimating smile intensity: A better way. Pattern Recognition Letters, 66, 13–21. https://doi.org/10.1016/j.patrec.2014.10.004
  34. Gironzetti, E., Attardo, S., & Pickering, L.
    (2016) Smiling, gaze, and humor in conversation: A pilot study. In L. Ruiz-Gurillo (Ed.), Metapragmatics of humor: Current research trends (pp. 235–254). John Benjamins Publishing Company. https://doi.org/10.1075/ivitra.14.12gir
  35. Gorisch, J., & Prévot, L.
    (2014) Aix-DVD, LPL. Retrieved from https://www.ortolang.fr/market/corpora/sldr000891?lang=en (last access 1 August 2023).
  36. Goujon, A., Bertrand, R., & Tellier, M.
    (2015) Eyebrows in French talk-in-interaction. In Proceedings of Gesture and Speech in Interaction – 4th edition (GESPIN 4) (pp. 125–130). Nantes, France.
  37. Guo, X., Polania, L., & Barner, K.
    (2018) Smile detection in the wild based on transfer learning. Proceedings of the 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018) (pp. 679–686). Xi’an, China, 2018. https://doi.org/10.1109/FG.2018.00107
  38. Hanna, J. E., & Brennan, S. E.
    (2007) Speakers’ eye gaze disambiguates referring expressions early during face-to-face conversation. Journal of Memory and Language, 57(4), 596–615. https://doi.org/10.1016/j.jml.2007.01.008
  39. Harker, L., & Keltner, D.
    (2001) Expressions of positive emotion in women’s college yearbook pictures and their relationship to personality and life outcomes across adulthood. Journal of Personality and Social Psychology, 80(1), 112–124. https://doi.org/10.1037/0022-3514.80.1.112
  40. Heerey, E. A., & Crossley, H. M.
    (2013) Predictive and reactive mechanisms in smile reciprocity. Psychological Science, 24(8), 1446–1455. https://doi.org/10.1177/0956797612472203
  41. Holler, J., Schubotz, L., Kelly, S., Hagoort, P., Schuetze, M., & Özyürek, A.
    (2014) Social eye gaze modulates processing of speech and co-speech gesture. Cognition, 133(3), 692–697. https://doi.org/10.1016/j.cognition.2014.08.008
  42. Jensen, M. H.
    (2015) Smile as feedback expressions in interpersonal interaction. International Journal of Psychological Studies, 7(4), 95–105. https://doi.org/10.5539/ijps.v7n4p95
  43. Jiang, H., Coskun, M., Badokhon, A., Liu, M., & Huang, M.-C.
    (2019) Hidden smile correlation discovery across subjects using random walk with restart. IEEE Transactions on Affective Computing, 10(1), 76–84. https://doi.org/10.1109/TAFFC.2017.2774278
  44. Kendon, A.
    (1967) Some functions of gaze-direction in social interaction. Acta Psychologica, 26, 22–63. https://doi.org/10.1016/0001-6918(67)90005-4
  45. (2004) Gesture: Visible action as utterance. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511807572
  46. Kent, A., Berry, M. M., Luehrs Jr., F. U., & Perry, J. W.
    (1955) Machine literature searching VIII. Operational criteria for designing information retrieval systems. American Documentation, 6(2), 93–101. https://doi.org/10.1002/asi.5090060209
  47. Kerbrat-Orecchioni, C., & Cosnier, J.
    (1987) Décrire la conversation. Lyon: Presses universitaires de Lyon.
  48. Kowdiki, M., & Khaparde, A.
    (2021) Automatic hand gesture recognition using hybrid meta-heuristic-based feature selection and classification with dynamic time warping. Computer Science Review, 39, 100320. https://doi.org/10.1016/j.cosrev.2020.100320
  49. Krippendorff, K.
    (2008) Systematic and random disagreement and the reliability of nominal data. Communication Methods and Measures, 2(4), 323–338. https://doi.org/10.1080/19312450802467134
  50. Krumhuber, E. G., Likowski, K. U., & Weyers, P.
    (2014) Facial mimicry of spontaneous and deliberate Duchenne and Non-Duchenne smiles. Journal of Nonverbal Behavior, 38, 1–11. https://doi.org/10.1007/s10919-013-0167-8
  51. Landis, J. R., & Koch, G. G.
    (1977) The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310
  52. Martinez, B., Valstar, M., Jiang, B., & Pantic, M.
    (2019) Automatic analysis of facial actions: A survey. IEEE Transactions on Affective Computing, 10(3), 325–347. https://doi.org/10.1109/TAFFC.2017.2731763
  53. McNeill, D.
    (1992) Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.
  54. (2012) How language began: Gesture and speech in human evolution. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139108669
  55. Powers, D. M. W.
    (2011) Evaluation: From precision, recall and F-factor to ROC, informedness, markedness & correlation. Journal of Machine Learning Technologies, 2(1), 37–63.
  56. (2012) The problem with kappa. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012) (pp. 345–355). Association for Computational Linguistics. Avignon, France, April 2012. Retrieved from dl.acm.org/citation.cfm?id=2380816.2380859 (last access 1 August 2023).
  57. Priego-Valverde, B., Bigi, B., Attardo, S., Pickering, L., & Gironzetti, E.
    (2018) Is smiling during humor so obvious? A cross-cultural comparison of smiling behavior in humorous sequences in American English and French interactions. Intercultural Pragmatics. Published by De Gruyter Mouton, October 31, 2018. Retrieved from https://hal.archives-ouvertes.fr/hal-01923442 (last access 1 August 2023).
  58. R Core Team
    (2016) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from https://www.R-project.org/ (last access 1 August 2023).
  59. Rabiner, L. R.
    (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–286. https://doi.org/10.1109/5.18626
  60. Rauzy, S., & Goujon, A.
    (2018) Automatic annotation of facial actions from a video record: The case of eyebrows raising and frowning. Proceedings of the workshop on “Affects, Compagnons Artificiels et Interactions” (WACAI 2018) (7 pages). Ed. Magalie Ochs. Porquerolles, France, June 2018. Retrieved from https://hal.archives-ouvertes.fr/hal-01769684 (last access 1 August 2023).
  61. RStudio Team
    (2015) RStudio: Integrated development environment for R. RStudio, Inc., Boston, MA. Retrieved from https://posit.co/ (last access 1 August 2023).
  62. Sacks, H., Schegloff, E., & Jefferson, G.
    (1974) A simplest systematics for the organization of turn taking in conversation. Language, 50, 696–735. https://doi.org/10.1353/lan.1974.0010
  63. Sanders, A. F.
    (2013) Elements of human performance: Reaction processes and attention in human skill. London: Psychology Press. https://doi.org/10.4324/9780203774250
  64. Schneider, P., Memmesheimer, R., Kramer, I., & Paulus, D.
    (2019) Gesture recognition in RGB videos using human body keypoints and dynamic time warping. In S. Chalup, T. Niemueller, J. Suthakorn, & M.-A. Williams (Eds.), RoboCup 2019: Robot World Cup XXIII (pp. 281–293). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-35699-6_22
  65. Seder, J. P., & Oishi, S.
    (2012) Intensity of smiling in Facebook photos predicts future life satisfaction. Social Psychological and Personality Science, 3(4), 407–413. https://doi.org/10.1177/1948550611424968
  66. Seger, R. A., Wanderley, M. M., & Koerich, A. L.
    (2014) Automatic detection of musicians’ ancillary gestures based on video analysis. Expert Systems with Applications, 41(4, Part 2), 2098–2106. https://doi.org/10.1016/j.eswa.2013.09.009
  67. Shan, C.
    (2012) Smile detection by boosting pixel differences. IEEE Transactions on Image Processing, 21(1), 431–436. https://doi.org/10.1109/TIP.2011.2161587
  68. Shimada, K., Matsukawa, T., Noguchi, Y., & Kurita, T.
    (2010) Appearance-based smile intensity estimation by cascaded support vector machines. Proceedings of the Asian Conference on Computer Vision Workshops (pp. 277–286). Queenstown, New Zealand, November 2010.
  69. Sim, J., & Wright, C. C.
    (2005) The kappa statistic in reliability studies: Use, interpretation, and sample size requirements. Physical Therapy, 85(3), 257–268. https://doi.org/10.1093/ptj/85.3.257
  70. Sloetjes, H., & Wittenburg, P.
    (2008) Annotation by category – ELAN and ISO DCR. Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008). Marrakech, Morocco, May 2008. Retrieved from https://tla.mpi.nl/tools/tla-tools/elan/ (last access 1 August 2023).
  71. Vettin, J., & Todt, D.
    (2004) Laughter in conversation: Features of occurrence and acoustic structure. Journal of Nonverbal Behavior, 28(2), 93–115. https://doi.org/10.1023/B:JONB.0000023654.73558.72
  72. Vinola, C., & Vimala Devi, K.
    (2019) Smile intensity recognition in real time videos: Fuzzy system approach. Multimedia Tools and Applications, 78(11), 15033–15052. https://doi.org/10.1007/s11042-018-6890-8
  73. Viterbi, A. J.
    (1967) Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260–269. https://doi.org/10.1109/TIT.1967.1054010
  74. Walecki, R., Rudovic, O., Pavlovic, V., & Pantic, M.
    (2019) Copula ordinal regression framework for joint estimation of facial action unit intensity. IEEE Transactions on Affective Computing, 10(3), 297–312. https://doi.org/10.1109/TAFFC.2017.2728534
  75. Whitehill, J., Littlewort, G., Fasel, I. R., Bartlett, M. S., & Movellan, J. R.
    (2009) Toward practical smile detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(11), 2106–2111. https://doi.org/10.1109/TPAMI.2009.42
  76. Zhang, K., Huang, Y., Wu, H., & Wang, L.
    (2015) Facial smile detection based on deep learning features. Proceedings of the 3rd Asian Conference on Pattern Recognition (ACPR 2015) (pp. 534–538). Kuala Lumpur, Malaysia, November 2015. https://doi.org/10.1109/ACPR.2015.7486560