Volume 18, Issue 3
  • ISSN 1871-1340
  • E-ISSN: 1871-1375

Abstract

Günther et al. (2022) investigated the relationship between words and images and concluded that a direct link between words and embodied experience is possible. In their study, participants were presented with a target noun and a pair of images, one chosen by their model and one chosen at random, and were asked to select the image that best matched the noun. Building on their work, we address the following questions: (1) Apart from visually embodied simulation, what other strategies might participants have used? How much does this setup rely on visual information, and can it be solved using textual representations alone? (2) Do current visually grounded embeddings explain participants' selection behavior better than textual embeddings? (3) Does visual grounding improve the representations of both concrete and abstract words? To this end, we designed novel experiments based on pre-trained word embeddings. Our experiments reveal that participants' selection behavior is explained to a large extent by text-based embeddings and word-based similarities alone; visually grounded embeddings offered only modest advantages in certain cases. These findings indicate that the experiment of Günther et al. (2022) may not be well suited to tapping into participants' perceptual experience, and the extent to which it measures visually grounded knowledge remains unclear.
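The selection task can be framed as a nearest-neighbor decision in embedding space: whichever candidate image's embedding lies closest to the target noun's embedding is the predicted choice. A minimal sketch, using toy vectors and plain cosine similarity (the function names and example values here are illustrative assumptions, not the study's actual model or embeddings):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_choice(word_vec, candidate_vecs):
    """Index of the candidate whose embedding is most similar to the
    target word's embedding -- the image a similarity-based account
    predicts a participant would select."""
    sims = [cosine(word_vec, c) for c in candidate_vecs]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy 3-d vectors standing in for real pre-trained embeddings
# (hypothetical values, for illustration only).
word = [1.0, 0.2, 0.0]          # target noun
model_image = [0.9, 0.3, 0.1]   # image chosen by the model
random_image = [0.0, 1.0, 0.8]  # randomly paired image

print(predict_choice(word, [model_image, random_image]))  # → 0
```

Swapping textual versus visually grounded vectors into `word_vec` (and image-derived vectors into the candidates) is, in essence, how the competing accounts of participants' choices can be compared.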

Available under the CC BY 4.0 license.
/content/journals/10.1075/ml.22010.sha
2024-01-11
2024-10-09

References

  1. Abdou, M., Kulmizev, A., Hershcovich, D., Frank, S., Pavlick, E., and Søgaard, A.
    (2021) Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color. In Proceedings of the 25th Conference on Computational Natural Language Learning, pages 109–132, Stroudsburg, PA, USA. Association for Computational Linguistics. 10.18653/v1/2021.conll‑1.9
    https://doi.org/10.18653/v1/2021.conll-1.9 [Google Scholar]
  2. Anderson, A. J., Bruni, E., Lopopolo, A., Poesio, M., and Baroni, M.
    (2015) Reading visually embodied meaning from the brain: Visually grounded computational models decode visual-object mental imagery induced by written text. NeuroImage, 120:309–322. 10.1016/j.neuroimage.2015.06.093
    https://doi.org/10.1016/j.neuroimage.2015.06.093 [Google Scholar]
  3. Anschütz, M., Lozano, D. M., and Groh, G.
    (2023) This is not correct! Negation-aware evaluation of language generation systems. 10.18653/v1/2023.inlg‑main.12
    https://doi.org/10.18653/v1/2023.inlg-main.12 [Google Scholar]
  4. Baroni, M.
    (2016) Grounding distributional semantics in the visual world. Language and Linguistics Compass, 10(1):3–13. 10.1111/lnc3.12170
    https://doi.org/10.1111/lnc3.12170 [Google Scholar]
  5. Barsalou, L. W.
    (1999) Perceptual symbol systems. Behavioral and Brain Sciences, 22(4). 10.1017/S0140525X99002149
    https://doi.org/10.1017/S0140525X99002149 [Google Scholar]
  6. (2003) Abstraction in perceptual symbol systems. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 358(1435). 10.1098/rstb.2003.1319
    https://doi.org/10.1098/rstb.2003.1319 [Google Scholar]
  7. (2008) Grounded Cognition. Annual Review of Psychology, 59(1). 10.1146/annurev.psych.59.103006.093639
    https://doi.org/10.1146/annurev.psych.59.103006.093639 [Google Scholar]
  8. (2010) Grounded cognition: Past, present, and future. Topics in cognitive science, 2(4):716–724. 10.1111/j.1756‑8765.2010.01115.x
    https://doi.org/10.1111/j.1756-8765.2010.01115.x [Google Scholar]
  9. Barsalou, L. W., Santos, A., Simmons, W. K., and Wilson, C. D.
    (2008) Language and simulation in conceptual processing. In Symbols and Embodiment: Debates on meaning and cognition. Oxford University Press. 10.1093/acprof:oso/9780199217274.003.0013
    https://doi.org/10.1093/acprof:oso/9780199217274.003.0013 [Google Scholar]
  10. Bordes, P., Zablocki, E., Soulier, L., Piwowarski, B., and Gallinari, P.
    (2019) Incorporating visual semantics into sentence representations within a grounded space. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 696–707, Hong Kong, China. Association for Computational Linguistics. 10.18653/v1/D19‑1064
    https://doi.org/10.18653/v1/D19-1064 [Google Scholar]
  11. Bruni, E., Tran, N.-K., and Baroni, M.
    (2014) Multimodal distributional semantics. Journal of Artificial Intelligence Research, 49:1–47. 10.1613/jair.4135
    https://doi.org/10.1613/jair.4135 [Google Scholar]
  12. Brysbaert, M., Warriner, A. B., and Kuperman, V.
    (2014) Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3):904–911. 10.3758/s13428‑013‑0403‑5
    https://doi.org/10.3758/s13428-013-0403-5 [Google Scholar]
  13. Buchanan, E. M., Valentine, K. D., and Maxwell, N. P.
    (2019) English semantic feature production norms: An extended database of 4436 concepts. Behavior Research Methods, 51(4). 10.3758/s13428‑019‑01243‑z
    https://doi.org/10.3758/s13428-019-01243-z [Google Scholar]
  14. Bulat, L., Clark, S., and Shutova, E.
    (2017) Speaking, Seeing, Understanding: Correlating semantic models with conceptual representation in the brain. InProceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA. Association for Computational Linguistics. 10.18653/v1/D17‑1113
    https://doi.org/10.18653/v1/D17-1113 [Google Scholar]
  15. Castelhano, M. S. and Rayner, K.
    (2008) Eye movements during reading, visual search, and scene perception: An overview. Cognitive and cultural influences on eye movements, 21751:3–33.
    [Google Scholar]
  16. Chatfield, K., Simonyan, K., Vedaldi, A., and Zisserman, A.
    (2014) Return of the Devil in the Details: Delving Deep into Convolutional Nets. arXiv preprint arXiv:1405.3531. 10.5244/C.28.6
    https://doi.org/10.5244/C.28.6 [Google Scholar]
  17. Chrupała, G., Kádár, Á., and Alishahi, A.
    (2015) Learning language through pictures. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 112–118, Beijing, China. Association for Computational Linguistics. 10.3115/v1/P15‑2019
    https://doi.org/10.3115/v1/P15-2019 [Google Scholar]
  18. Collell Talleda, G., Zhang, T., and Moens, M.-F.
    (2017) Imagined visual representations as multimodal embeddings. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), pages 4378–4384. AAAI. 10.1609/aaai.v31i1.11155
    https://doi.org/10.1609/aaai.v31i1.11155 [Google Scholar]
  19. Cree, G. S. and McRae, K.
    (2003) Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello (and many other such concrete nouns). Journal of Experimental Psychology: General, 132(2).
    [Google Scholar]
  20. Cronin, D. A., Hall, E. H., Goold, J. E., Hayes, T. R., and Henderson, J. M.
    (2020) Eye movements in real-world scene photographs: General characteristics and effects of viewing task. Frontiers in Psychology, 10:2915. 10.3389/fpsyg.2019.02915
    https://doi.org/10.3389/fpsyg.2019.02915 [Google Scholar]
  21. De Deyne, S., Navarro, D. J., Collell, G., and Perfors, A.
    (2021) Visual and Affective Multimodal Models of Word Meaning in Language and Mind. Cognitive Science, 45(1). 10.1111/cogs.12922
    https://doi.org/10.1111/cogs.12922 [Google Scholar]
  22. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L.
    (2009) ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE. 10.1109/CVPR.2009.5206848
    https://doi.org/10.1109/CVPR.2009.5206848 [Google Scholar]
  23. Dolan, R. J.
    (2002) Emotion, cognition, and behavior. Science, 298(5596):1191–1194. 10.1126/science.1076358
    https://doi.org/10.1126/science.1076358 [Google Scholar]
  24. Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., and Ruppin, E.
    (2001) Placing search in context: The concept revisited. In Proceedings of the 10th International Conference on World Wide Web, pages 406–414. 10.1145/371920.372094
    https://doi.org/10.1145/371920.372094 [Google Scholar]
  25. Gerz, D., Vulić, I., Hill, F., Reichart, R., and Korhonen, A.
    (2016) SimVerb-3500: A large-scale evaluation set of verb similarity. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2173–2182, Austin, Texas. Association for Computational Linguistics. 10.18653/v1/D16‑1235
    https://doi.org/10.18653/v1/D16-1235 [Google Scholar]
  26. Goldstone, R. L.
    (1995) Effects of Categorization on Color Perception. Psychological Science, 6(5). 10.1111/j.1467‑9280.1995.tb00514.x
    https://doi.org/10.1111/j.1467-9280.1995.tb00514.x [Google Scholar]
  27. Grondin, R., Lupker, S. J., and McRae, K.
    (2009) Shared features dominate semantic richness effects for concrete concepts. Journal of Memory and Language, 60(1):1–19. 10.1016/j.jml.2008.09.001
    https://doi.org/10.1016/j.jml.2008.09.001 [Google Scholar]
  28. Günther, F., Petilli, M. A., Vergallito, A., and Marelli, M.
    (2022) Images of the unseen: extrapolating visual representations for abstract and concrete words in a data-driven computational model. Psychological Research. 10.1007/s00426‑020‑01429‑7
    https://doi.org/10.1007/s00426-020-01429-7 [Google Scholar]
  29. Günther, F., Rinaldi, L., and Marelli, M.
    (2019) Vector-Space Models of Semantic Representation From a Cognitive Perspective: A Discussion of Common Misconceptions. Perspectives on Psychological Science, 14(6):1006–1033. 10.1177/1745691619861372
    https://doi.org/10.1177/1745691619861372 [Google Scholar]
  30. Halawi, G., Dror, G., Gabrilovich, E., and Koren, Y.
    (2012) Large-scale learning of word relatedness with constraints. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1406–1414. 10.1145/2339530.2339751
    https://doi.org/10.1145/2339530.2339751 [Google Scholar]
  31. Harnad, S.
    (1990) The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1–3):335–346. 10.1016/0167‑2789(90)90087‑6
    https://doi.org/10.1016/0167-2789(90)90087-6 [Google Scholar]
  32. Harris, Z. S.
    (1954) Distributional Structure. WORD, 10(2–3). 10.1080/00437956.1954.11659520
    https://doi.org/10.1080/00437956.1954.11659520 [Google Scholar]
  33. Hasegawa, M., Kobayashi, T., and Hayashi, Y.
    (2017) Incorporating visual features into word embeddings: A bimodal autoencoder-based approach. In IWCS 2017 – 12th International Conference on Computational Semantics – Short papers.
    [Google Scholar]
  34. Hill, F., Reichart, R., and Korhonen, A.
    (2015) Simlex-999: Evaluating semantic models with (genuine) similarity estimation. Computational Linguistics, 41(4):665–695. 10.1162/COLI_a_00237
    https://doi.org/10.1162/COLI_a_00237 [Google Scholar]
  35. Hochreiter, S. and Schmidhuber, J.
    (1997) Long short-term memory. Neural computation, 9(8):1735–1780. 10.1162/neco.1997.9.8.1735
    https://doi.org/10.1162/neco.1997.9.8.1735 [Google Scholar]
  36. Hoffman, D.
    (2019) The case against reality: Why evolution hid the truth from our eyes. WW Norton & Company.
    [Google Scholar]
  37. Hollenstein, N., de la Torre, A., Langer, N., and Zhang, C.
    (2019) CogniVal: A Framework for Cognitive Word Embedding Evaluation. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), Stroudsburg, PA, USA. Association for Computational Linguistics. 10.18653/v1/K19‑1050
    https://doi.org/10.18653/v1/K19-1050 [Google Scholar]
  38. Howell, S. R., Jankowicz, D., and Becker, S.
    (2005) A model of grounded language acquisition: Sensorimotor features improve lexical and grammatical learning. Journal of Memory and Language, 53(2):258–276. 10.1016/j.jml.2005.03.002
    https://doi.org/10.1016/j.jml.2005.03.002 [Google Scholar]
  39. Kant, I., Guyer, P., and Wood, A. W.
    (1781/1999) Critique of pure reason. Cambridge University Press.
    [Google Scholar]
  40. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K.
    (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, pages 4171–4186.
    [Google Scholar]
  41. Kiela, D. and Bottou, L.
    (2014) Learning image embeddings using convolutional neural networks for improved multi-modal semantics. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 36–45, Doha, Qatar. Association for Computational Linguistics. 10.3115/v1/D14‑1005
    https://doi.org/10.3115/v1/D14-1005 [Google Scholar]
  42. Kiela, D., Bulat, L., and Clark, S.
    (2015) Grounding semantics in olfactory perception. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 231–236. 10.3115/v1/P15‑2038
    https://doi.org/10.3115/v1/P15-2038 [Google Scholar]
  43. Kiela, D. and Clark, S.
    (2015) Multi- and Cross-Modal Semantics Beyond Vision: Grounding in Auditory Perception. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA. Association for Computational Linguistics. 10.18653/v1/D15‑1293
    https://doi.org/10.18653/v1/D15-1293 [Google Scholar]
  44. Kiela, D., Conneau, A., Jabri, A., and Nickel, M.
    (2018) Learning visually grounded sentence representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 408–418, New Orleans, Louisiana. Association for Computational Linguistics. 10.18653/v1/N18‑1038
    https://doi.org/10.18653/v1/N18-1038 [Google Scholar]
  45. Kiros, J., Chan, W., and Hinton, G.
    (2018) Illustrative language understanding: Large-scale visual grounding with image search. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 922–933, Melbourne, Australia. Association for Computational Linguistics. 10.18653/v1/P18‑1085
    https://doi.org/10.18653/v1/P18-1085 [Google Scholar]
  46. Lakoff, G.
    (1987) Women, Fire, and Dangerous Things. University of Chicago Press. 10.7208/chicago/9780226471013.001.0001
    https://doi.org/10.7208/chicago/9780226471013.001.0001 [Google Scholar]
  47. Lakoff, G. and Johnson, M.
    (1980) The metaphorical structure of the human conceptual system. Cognitive science, 4(2):195–208. 10.1207/s15516709cog0402_4
    https://doi.org/10.1207/s15516709cog0402_4 [Google Scholar]
  48. Landauer, T. K. and Dumais, S. T.
    (1997) A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2). 10.1037/0033‑295X.104.2.211
    https://doi.org/10.1037/0033-295X.104.2.211 [Google Scholar]
  49. Langacker, R. W.
    (1999) A view from cognitive linguistics. Behavioral and Brain Sciences, 22(4). 10.1017/S0140525X99392141
    https://doi.org/10.1017/S0140525X99392141 [Google Scholar]
  50. Lazaridou, A., Chrupała, G., Fernández, R., and Baroni, M.
    (2016) Multimodal Semantic Learning from Child-Directed Input. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Stroudsburg, PA, USA. Association for Computational Linguistics. 10.18653/v1/N16‑1043
    https://doi.org/10.18653/v1/N16-1043 [Google Scholar]
  51. Lazaridou, A., Marelli, M., and Baroni, M.
    (2017) Multimodal Word Meaning Induction From Minimal Exposure to Natural Text. Cognitive Science, 41. 10.1111/cogs.12481
    https://doi.org/10.1111/cogs.12481 [Google Scholar]
  52. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L.
    (2014) Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740–755. Springer. 10.1007/978‑3‑319‑10602‑1_48
    https://doi.org/10.1007/978-3-319-10602-1_48 [Google Scholar]
  53. Louwerse, M. and Connell, L.
    (2011) A Taste of Words: Linguistic Context and Perceptual Simulation Predict the Modality of Words. Cognitive Science, 35(2):381–398. 10.1111/j.1551‑6709.2010.01157.x
    https://doi.org/10.1111/j.1551-6709.2010.01157.x [Google Scholar]
  54. Louwerse, M. M.
    (2011) Symbol interdependency in symbolic and embodied cognition. Topics in Cognitive Science, 3(2):273–302. 10.1111/j.1756‑8765.2010.01106.x
    https://doi.org/10.1111/j.1756-8765.2010.01106.x [Google Scholar]
  55. Louwerse, M. M. and Zwaan, R. A.
    (2009) Language Encodes Geographical Information. Cognitive Science, 33(1):51–73. 10.1111/j.1551‑6709.2008.01003.x
    https://doi.org/10.1111/j.1551-6709.2008.01003.x [Google Scholar]
  56. Luong, T., Socher, R., and Manning, C.
    (2013) Better word representations with recursive neural networks for morphology. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 104–113, Sofia, Bulgaria. Association for Computational Linguistics.
    [Google Scholar]
  57. Lynott, D., Connell, L., Brysbaert, M., Brand, J., and Carney, J.
    (2020) The Lancaster Sensorimotor Norms: multidimensional measures of perceptual and action strength for 40,000 English words. Behavior Research Methods, 52(3). 10.3758/s13428‑019‑01316‑z
    https://doi.org/10.3758/s13428-019-01316-z [Google Scholar]
  58. Mandera, P., Keuleers, E., and Brysbaert, M.
    (2017) Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language, 92. 10.1016/j.jml.2016.04.001
    https://doi.org/10.1016/j.jml.2016.04.001 [Google Scholar]
  59. Marelli, M. and Amenta, S.
    (2018) A database of orthography-semantics consistency (OSC) estimates for 15,017 English words. Behavior Research Methods, 50:1482–1495. 10.3758/s13428‑018‑1017‑8
    https://doi.org/10.3758/s13428-018-1017-8 [Google Scholar]
  60. Martin, A.
    (2007) The Representation of Object Concepts in the Brain. Annual Review of Psychology, 58(1):25–45. 10.1146/annurev.psych.57.102904.190143
    https://doi.org/10.1146/annurev.psych.57.102904.190143 [Google Scholar]
  61. McRae, K., Cree, G. S., Seidenberg, M. S., and Mcnorgan, C.
    (2005) Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37(4). 10.3758/BF03192726
    https://doi.org/10.3758/BF03192726 [Google Scholar]
  62. Mikolov, T., Chen, K., Corrado, G., and Dean, J.
    (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
    [Google Scholar]
  63. Mkrtychian, N., Blagovechtchenski, E., Kurmakaeva, D., Gnedykh, D., Kostromina, S., and Shtyrov, Y.
    (2019) Concrete vs. Abstract Semantics: From Mental Representations to Functional Brain Mapping. Frontiers in Human Neuroscience, 13:267. 10.3389/fnhum.2019.00267
    https://doi.org/10.3389/fnhum.2019.00267 [Google Scholar]
  64. Montefinese, M.
    (2019) Semantic representation of abstract and concrete words: A minireview of neural evidence. Journal of Neurophysiology, 121(5):1585–1587. 10.1152/jn.00065.2019
    https://doi.org/10.1152/jn.00065.2019 [Google Scholar]
  65. Park, J. and Myaeng, S.-h.
    (2017a) A computational study on word meanings and their distributed representations via polymodal embedding. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 214–223, Taipei, Taiwan. Asian Federation of Natural Language Processing.
    [Google Scholar]
  66. (2017b) A computational study on word meanings and their distributed representations via polymodal embedding. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 214–223.
    [Google Scholar]
  67. Pennington, J., Socher, R., and Manning, C.
    (2014) GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Stroudsburg, PA, USA. Association for Computational Linguistics. 10.3115/v1/D14‑1162
    https://doi.org/10.3115/v1/D14-1162 [Google Scholar]
  68. Pezzelle, S., Takmaz, E., and Fernández, R.
    (2021) Word representation learning in multimodal pre-trained transformers: An intrinsic evaluation. Transactions of the Association for Computational Linguistics, 9:1563–1579. 10.1162/tacl_a_00443
    https://doi.org/10.1162/tacl_a_00443 [Google Scholar]
  69. Rotaru, A. S. and Vigliocco, G.
    (2020a) Constructing semantic models from words, images, and emojis. Cognitive science, 44(4):e12830. 10.1111/cogs.12830
    https://doi.org/10.1111/cogs.12830 [Google Scholar]
  70. (2020b) Constructing Semantic Models From Words, Images, and Emojis. Cognitive Science, 44(4):e12830. 10.1111/cogs.12830
    https://doi.org/10.1111/cogs.12830 [Google Scholar]
  71. Rozenkrants, B., Olofsson, J. K., and Polich, J.
    (2008) Affective visual event-related potentials: arousal, valence, and repetition effects for normal and distorted pictures. International Journal of Psychophysiology, 67(2):114–123.
    [Google Scholar]
  72. Shahmohammadi, H., Heitmeier, M., Shafaei-Bajestan, E., Lensch, H., and Baayen, H.
    (2023) Language with vision: a study on grounded word and sentence embeddings. Behavior Research Methods, accepted for publication. 10.3758/s13428‑023‑02294‑z
    https://doi.org/10.3758/s13428-023-02294-z [Google Scholar]
  73. Shahmohammadi, H., Lensch, H. P. A., and Baayen, R. H.
    (2021) Learning zero-shot multifaceted visually grounded word embeddings via multi-task training. In Proceedings of the 25th Conference on Computational Natural Language Learning, pages 158–170, Online. Association for Computational Linguistics. 10.18653/v1/2021.conll‑1.12
    https://doi.org/10.18653/v1/2021.conll-1.12 [Google Scholar]
  74. Silberer, C. and Lapata, M.
    (2014) Learning grounded meaning representations with autoencoders. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 721–732, Baltimore, Maryland. Association for Computational Linguistics. 10.3115/v1/P14‑1068
    https://doi.org/10.3115/v1/P14-1068 [Google Scholar]
  75. Simmons, W. K., Martin, A., and Barsalou, L. W.
    (2005) Pictures of Appetizing Foods Activate Gustatory Cortices for Taste and Reward. Cerebral Cortex, 15(10):1602–1608. 10.1093/cercor/bhi038
    https://doi.org/10.1093/cercor/bhi038 [Google Scholar]
  76. Solomon, K. O. and Barsalou, L. W.
    (2001) Representing Properties Locally. Cognitive Psychology, 43(2):129–169. 10.1006/cogp.2001.0754
    https://doi.org/10.1006/cogp.2001.0754 [Google Scholar]
  77. (2004) Perceptual simulation in property verification. Memory & Cognition, 32(2):244–259. 10.3758/BF03196856
    https://doi.org/10.3758/BF03196856 [Google Scholar]
  78. Tan, H. and Bansal, M.
    (2020) Vokenization: Improving language understanding with contextualized, visual-grounded supervision. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2066–2080, Online. Association for Computational Linguistics. 10.18653/v1/2020.emnlp‑main.162
    https://doi.org/10.18653/v1/2020.emnlp-main.162 [Google Scholar]
  79. Tan, M. and Le, Q.
    (2019) EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pages 6105–6114. PMLR.
    [Google Scholar]
  80. Utsumi, A.
    (2022) A test of indirect grounding of abstract concepts using multimodal distributional semantics. Frontiers in Psychology, 13. 10.3389/fpsyg.2022.906181
    https://doi.org/10.3389/fpsyg.2022.906181 [Google Scholar]
  81. Vigliocco, G., Ponari, M., and Norbury, C.
    (2018) Learning and processing abstract words and concepts: Insights from typical and atypical development. Topics in cognitive science, 10(3):533–549. 10.1111/tops.12347
    https://doi.org/10.1111/tops.12347 [Google Scholar]
  82. Wang, B., Wang, A., Chen, F., Wang, Y., and Kuo, C.-C. J.
    (2019) Evaluating word embedding models: Methods and experimental results. APSIPA Transactions on Signal and Information Processing, 8. 10.1017/ATSIP.2019.12
    https://doi.org/10.1017/ATSIP.2019.12 [Google Scholar]
  83. Westbury, C.
    (2014) You Can’t Drink a Word: Lexical and Individual Emotionality Affect Subjective Familiarity Judgments. Journal of Psycholinguistic Research, 43(5). 10.1007/s10936‑013‑9266‑2
    https://doi.org/10.1007/s10936-013-9266-2 [Google Scholar]
  84. Westbury, C. and Hollis, G.
    (2019) Wriggly, squiffy, lummox, and boobs: What makes some words funny? Journal of Experimental Psychology: General, 148(1).
    [Google Scholar]
  85. Wood, S. N.
    (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B), 73(1):3–36. 10.1111/j.1467‑9868.2010.00749.x
    https://doi.org/10.1111/j.1467-9868.2010.00749.x [Google Scholar]
  86. Yun, T., Sun, C., and Pavlick, E.
    (2021) Does vision-and-language pretraining improve lexical grounding? In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4357–4366, Punta Cana, Dominican Republic. Association for Computational Linguistics. 10.18653/v1/2021.findings‑emnlp.370
    https://doi.org/10.18653/v1/2021.findings-emnlp.370 [Google Scholar]
  87. Zwaan, R. A. and Madden, C. J.
    (2005) Embodied Sentence Comprehension. In Grounding Cognition. Cambridge University Press. 10.1017/CBO9780511499968.010
    https://doi.org/10.1017/CBO9780511499968.010 [Google Scholar]

  • Article Type: Research Article
Keyword(s): grounded cognition; visual grounding; word embeddings