Volume 17, Issue 3
  • ISSN 1871-1340
  • E-ISSN: 1871-1375



We used word embeddings to study the relation between productivity and semantic transparency. We compiled a dataset of around 2,700 two-syllable compounds that shared position-specific constituents (henceforth pivots) and some 1,100 suffixed words. For each pivot and suffix, we calculated measures of productivity as well as measures of semantic transparency. For compounds, productivity (P) was negatively correlated with the number of types (V) and with the semantic similarity between non-pivot constituents and their compounds. Conversely, greater semantic similarity of the pivot to either the compound or the non-pivot constituent predicted higher productivity. Visualization with t-SNE revealed clustering of suffixed words' embeddings, but no by-pivot clustering for compounds, except for a minority of pivots whose regions in semantic space did not contain intruding unrelated compounds. A subset of these pivots was found to realize a fixed shift in semantic space from the base word to the corresponding compound, a property that also emerged for several suffixes. For these pivots, no correlation between P and V was present. Thus, Mandarin compounds appear to realize, at one extreme, motivated but unsystematic concept formation (where other pivots could just as well have been used), and at the other extreme, systematic suffix-like semantics.
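The abstract does not spell out how the measures were computed, so the following is only a minimal sketch under stated assumptions. It assumes productivity is the hapax-based ratio (types occurring once divided by total tokens for a pivot or suffix), that the number of types is a simple count of distinct words, and that semantic similarity is cosine similarity between embedding vectors. The function names, toy frequencies, and two-dimensional vectors are hypothetical and are not taken from the study.

```python
import math

def productivity(token_counts):
    """Return (types, productivity) for one pivot or suffix.

    token_counts maps each word containing the pivot/suffix to its
    corpus frequency. Productivity is ASSUMED here to be the number
    of hapax legomena divided by the total token count.
    """
    n_types = len(token_counts)
    n_tokens = sum(token_counts.values())
    hapaxes = sum(1 for freq in token_counts.values() if freq == 1)
    return n_types, (hapaxes / n_tokens if n_tokens else 0.0)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy data: three words sharing a pivot, two of them hapaxes.
counts = {"w1": 1, "w2": 1, "w3": 8}
n_types, p = productivity(counts)

# Hypothetical 2-d embeddings for a pivot and one of its compounds.
pivot_vec = [1.0, 0.0]
compound_vec = [1.0, 1.0]
sim = cosine(pivot_vec, compound_vec)
```

With per-pivot values like these in hand, one could correlate productivity with type counts and with the similarity scores across pivots; the study's actual embeddings and statistical models may differ from this sketch.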

Available under the CC BY 4.0 license.




