Volume 17, Issue 3
  • ISSN 1871-1340
  • E-ISSN: 1871-1375

Abstract

Finnish nouns are characterized by rich inflectional variation, with obligatory marking of case and number, with optional possessive suffixes and with the possibility of further cliticization. We present a model for the conceptualization of Finnish inflected nouns, using pre-compiled fasttext embeddings (300-dimensional semantic vectors that approximate words’ meanings). Instead of deriving the semantic vector of an inflected word from another word in its paradigm, we propose that an inflected word is conceptualized by means of summation of latent vectors representing the meanings of its lexeme and its inflectional features. We tested this model on the 2,000 most frequent Finnish nouns and their inflected word forms from a corpus of Finnish (84 million tokens). Visualization of the semantic space of Finnish using t-SNE clarified that a ‘main effects’ additive model does not do justice to the semantics of inflection. In Finnish, how number is realized turns out to vary substantially with case. Further interactions emerged with the possessive suffixes and the clitics. By taking these interactions into account, the accuracy of our model, evaluated with the fasttext embeddings as gold standard, improved from 76% to 89%. Analyses of the errors made by the model clarified that 7.5% of errors are due to overabundance (and hence not true errors), and that 16.5% of the errors involved exchanges of semantically highly similar stems (lexemes). Our results indicate, first, that the semantics of Finnish noun inflection are more intricate than assumed thus far, and second, that these intricacies can be captured with surprisingly high accuracy by a simple generating model based on imputed semantic vectors for lexemes, inflectional features, and interactions of inflectional features.
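
The contrast between a 'main effects' additive model and a model with feature interactions can be illustrated with a small sketch. It is not the authors' implementation: the toy data are randomly generated rather than taken from fasttext, the dimensionality is 10 rather than 300, the names (`design_matrix`, `fit_and_score`, the three cases and two numbers) are hypothetical, and estimating the latent vectors by ordinary least squares is an assumption, since the abstract does not say how the vectors were imputed. Each predicted vector is the sum of the latent vectors of a form's lexeme, its inflectional features, and (in the second model) a case-by-number interaction.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
DIM = 10  # toy dimensionality; the embeddings described in the text have 300

lexemes = [f"lex{i}" for i in range(20)]     # hypothetical lexemes
cases = ["nom", "gen", "ine"]                # hypothetical subset of cases
numbers = ["sg", "pl"]
pairs = [f"{c}*{n}" for c in cases for n in numbers]  # case-by-number interactions

def random_vectors(names):
    return {name: rng.normal(size=DIM) for name in names}

lex_vec, case_vec = random_vectors(lexemes), random_vectors(cases)
num_vec, pair_vec = random_vectors(numbers), random_vectors(pairs)

# Synthetic "gold" embeddings that contain a genuine case-by-number interaction.
forms = list(itertools.product(lexemes, cases, numbers))
gold = np.array([
    lex_vec[l] + case_vec[c] + num_vec[n] + pair_vec[f"{c}*{n}"]
    + rng.normal(scale=0.1, size=DIM)
    for l, c, n in forms
])

def design_matrix(rows, columns):
    """One row per word form, one 0/1 column per lexeme, feature, or interaction."""
    index = {name: j for j, name in enumerate(columns)}
    X = np.zeros((len(rows), len(columns)))
    for i, active in enumerate(rows):
        for name in active:
            X[i, index[name]] = 1.0
    return X

def accuracy(S_hat, S):
    """Share of forms whose predicted vector is closest (cosine) to its own gold vector."""
    P = S_hat / np.linalg.norm(S_hat, axis=1, keepdims=True)
    G = S / np.linalg.norm(S, axis=1, keepdims=True)
    return float(((P @ G.T).argmax(axis=1) == np.arange(len(S))).mean())

def fit_and_score(X, S):
    # Least-squares estimate of the latent vectors (assumption, see lead-in).
    B, *_ = np.linalg.lstsq(X, S, rcond=None)
    S_hat = X @ B  # each prediction is a sum of the active latent vectors
    rmse = float(np.sqrt(np.mean((S_hat - S) ** 2)))
    return rmse, accuracy(S_hat, S)

main_cols = lexemes + cases + numbers
X_main = design_matrix([[l, c, n] for l, c, n in forms], main_cols)
X_inter = design_matrix([[l, c, n, f"{c}*{n}"] for l, c, n in forms],
                        main_cols + pairs)

rmse_main, acc_main = fit_and_score(X_main, gold)
rmse_inter, acc_inter = fit_and_score(X_inter, gold)
print(f"main effects:      rmse={rmse_main:.3f} accuracy={acc_main:.2f}")
print(f"with interactions: rmse={rmse_inter:.3f} accuracy={acc_inter:.2f}")
```

Because the interaction model contains every column of the main-effects model, its least-squares fit error cannot be larger; whether nearest-neighbour accuracy also improves depends on the data, which is the question the article evaluates on real embeddings.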

Available under the CC BY 4.0 license.

/content/journals/10.1075/ml.22008.nik
2023-03-17
2024-09-17
