The Rendaku Database
Of the studies that have examined rendaku from a statistical angle, most have been small-scale, employing restricted corpora or micro-databases and often focusing on specific conditions. The lack of a large-scale corpus was the impetus behind the creation of the Rendaku Database, which is available online for ongoing and future research. In this paper, both the initial and non-initial elements of approximately 28,000 compounds are subjected to a detailed analysis by vocabulary stratum, length, part of speech, accent pattern, and frequency, as well as by the moras on either side of the element boundary. Among the core findings are that initial elements that are verbs show aberrantly low rendaku rates, while non-initial elements that are deadjectival nouns, and those that begin with h, both exhibit considerably higher than average rendaku rates.