Genetic and linguistic diversity in Central Asia
In this study, we used genetic and linguistic data collected in Central Asia to better understand how genetic and linguistic diversity correlate in a contact zone. We assessed levels of genetic differentiation using mitochondrial, Y-chromosomal and autosomal data from 26 populations (1300 individuals) belonging to the two major linguistic groups in Central Asia: Indo-European and Altaic. We computed the linguistic distance between populations from lexical data collected from several individuals per population. Our results show that genetic diversity in the area clusters clearly into two groups that are explained by linguistic affiliation: one comprises the Indo-Iranian populations and the other the Turkic populations, with the exception of the Uzbek populations. In addition, for two populations we detected a language shift that likely occurred through an elite-dominance effect. Furthermore, using linguistic distances computed from lexical data (Levenshtein distance), we find a strong correlation between genetic and linguistic distances but no correlation between genetic and geographical distances. In conclusion, Central Asia is an area where language, but not geography, correlates with genetic diversity, highlighting the importance of a cultural trait in shaping genetic diversity in our species.
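
To illustrate how a lexical distance of the kind mentioned above can be computed and compared with another distance matrix, here is a minimal Python sketch. It rests on assumptions the abstract does not confirm: that Levenshtein distances are normalised by the length of the longer word and averaged over aligned word lists, and that the matrices are compared with a Mantel-style permutation test. All function names are hypothetical and the inputs are toy values, not study data.

```python
import random
from itertools import combinations


def levenshtein(a, b):
    """Minimum number of single-character edits turning string a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lexical_distance(words_a, words_b):
    """Mean length-normalised Levenshtein distance over aligned word lists.

    Assumption: both lists give the same meanings in the same order;
    pairs with a missing (empty) entry are skipped.
    """
    pairs = [(x, y) for x, y in zip(words_a, words_b) if x and y]
    total = sum(levenshtein(x, y) / max(len(x), len(y)) for x, y in pairs)
    return total / len(pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def mantel(d1, d2, permutations=999, seed=0):
    """Mantel-style test (an assumed choice of test, not the study's stated one).

    Correlates the upper triangles of two square distance matrices and
    estimates a one-sided p-value by permuting the population labels of d2.
    """
    n = len(d1)
    idx = list(combinations(range(n), 2))
    flat1 = [d1[i][j] for i, j in idx]
    observed = pearson(flat1, [d2[i][j] for i, j in idx])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [d2[order[i]][order[j]] for i, j in idx]
        if pearson(flat1, permuted) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)
```

With real data, `mantel` would be called once with the genetic and linguistic matrices and once with the genetic and geographical matrices, with populations in the same order in every matrix.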