Frequency of use and basic vocabulary
We use corpora from 18 languages to study the frequency of use of basic words such as mother, sun, and red. We compare three lists of basic vocabulary, Swadesh-200, Swadesh-100, and the Leipzig-Jakarta list (Tadmor 2009), and find that they have a high average inter-correlation. Using the WOLD semantic categories and fields (Haspelmath and Tadmor 2009), we find regularities in the types of meaning that are most likely to deviate from the overall correlations, i.e. meanings whose frequency of use varies significantly. These include meanings encoded by function words, basic actions (do/make), spatial relations (left, right), cognition words (to know, when), and possession (to take). Our results indicate a core collection of basic meanings that are used with similar regularity across languages, despite other linguistic pressures impinging on these frequencies.
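As an illustration only, the kind of comparison described above can be sketched as a mean pairwise rank correlation of per-meaning frequencies across languages. The sketch assumes Spearman's rank correlation, which is not necessarily the statistic used in this study. The function names, language labels, and frequency values below are hypothetical and are not taken from the corpora.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_inter_correlation(freqs):
    """Average Spearman correlation over all pairs of languages.

    freqs maps a language label to a list of frequencies, one per meaning,
    with the meanings in the same order for every language.
    """
    pairs = combinations(sorted(freqs), 2)
    return mean(spearman(freqs[a], freqs[b]) for a, b in pairs)


# Hypothetical toy data: occurrences per million words (invented values).
MEANINGS = ["mother", "sun", "red", "know", "take"]
TOY_FREQS = {
    "lang_a": [120.0, 80.0, 40.0, 300.0, 250.0],
    "lang_b": [100.0, 90.0, 30.0, 280.0, 260.0],
    "lang_c": [110.0, 60.0, 50.0, 150.0, 400.0],
}

if __name__ == "__main__":
    print(round(mean_inter_correlation(TOY_FREQS), 3))
```

A rank-based measure is chosen here because raw word frequencies are heavily skewed; meanings whose ranks differ most between languages would be the candidates for the deviating categories mentioned above.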