Volume 17, Issue 3
  • ISSN 1871-1340
  • E-ISSN: 1871-1375

Abstract

Distributional semantics offers new ways to study the semantics of morphology. This study focuses on the semantics of noun singulars and their plural inflectional variants in English. Our goal is to compare two models for the conceptualization of plurality. One model (FRACSS) proposes that all singular-plural pairs should be taken into account when predicting plural semantics from singular semantics. The other model (CCA) argues that conceptualization for plurality depends primarily on the semantic class of the base word. We compare the two models on the basis of how well the speech signal of plural tokens in a large corpus of spoken American English aligns with the semantic vectors predicted by the two models. Two measures are employed: the performance of a form-to-meaning mapping and the correlations between form distances and meaning distances. Results converge on a superior alignment for CCA. Our results suggest that usage-based approaches to pluralization, in which a given word’s own semantic neighborhood is given priority, outperform theories according to which pluralization is conceptualized as a process building on high-level abstraction. We see that what has often been conceived of as a highly abstract concept, [+plural], is better captured via a family of mid-level partial generalizations.
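
The contrast between the two models can be illustrated with a minimal sketch on toy vectors. It rests on assumptions the abstract does not confirm: the FRACSS-style model is rendered here as one least-squares linear map fitted on all singular-plural pairs, and the CCA-style model as the singular vector plus an average shift vector for the word's semantic class. All data, class labels, and function names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 singular vectors in 4 dimensions, two semantic classes.
n_words, dim = 6, 4
singular = rng.normal(size=(n_words, dim))
classes = np.array([0, 0, 0, 1, 1, 1])  # assumed class labels, not from the paper

# Toy plural vectors: singular + a class-specific shift + a little noise.
true_shift = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0]])
plural = singular + true_shift[classes] + 0.01 * rng.normal(size=(n_words, dim))


def fit_global_map(S, P):
    """Assumed FRACSS-style model: one linear map M with S @ M ~ P,
    estimated by least squares over all singular-plural pairs."""
    M, *_ = np.linalg.lstsq(S, P, rcond=None)
    return M


def fit_class_shifts(S, P, labels):
    """Assumed CCA-style model: average plural-minus-singular shift per class."""
    return {c: (P[labels == c] - S[labels == c]).mean(axis=0)
            for c in np.unique(labels)}


def predict_class_shift(S, labels, shifts):
    """Predict each plural as its singular plus the shift of its class."""
    return np.stack([s + shifts[c] for s, c in zip(S, labels)])


def mean_cosine(A, B):
    """Mean row-wise cosine similarity between two matrices."""
    num = (A * B).sum(axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return float((num / den).mean())


M = fit_global_map(singular, plural)
shifts = fit_class_shifts(singular, plural, classes)

pred_global = singular @ M
pred_class = predict_class_shift(singular, classes, shifts)

print("global map  :", mean_cosine(pred_global, plural))
print("class shifts:", mean_cosine(pred_class, plural))
```

Because the toy plurals are generated from class-specific shifts, this sketch says nothing about which model fits real data better. It only shows where the two approaches differ: one set of parameters shared by all nouns versus one shift vector per semantic class.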

Available under the CC BY 4.0 license.
/content/journals/10.1075/ml.22011.sha
2023-08-17
2024-12-13