Volume 38, Issue 4
  • ISSN 0176-4225
  • E-ISSN: 1569-9714



Based on a new reconstruction of Proto-Basque, and regular sound correspondences between this Proto-Basque and Proto-Indo-European as standardly reconstructed, Blevins (2018) argues that Proto-Basque and Proto-Indo-European have a common ancestor that pre-dates the two proto-languages. Part of this argument is based on proposed Proto-Indo-European/Proto-Basque cognate sets that include basic vocabulary items. In this study we offer statistical support for Blevins’ conclusions by using a Monte Carlo simulation that allows us to estimate the probability that the proposed lexical correspondences could have arisen by chance. The method makes use of phonotactic language models to generate possible words in a pair of languages, and then attempts to discover consistent correspondences between the words, producing a list of possible “cognates”. The method differs from some previous approaches by considering matches between all segments in the word pairs. By running such a simulation a large number of times, we can estimate the probability that two languages with the given phonotactics could have produced the number of cognate pairs observed in the actual data. The method is independently assessed by comparing wordlists from 100 pairs of languages, related and unrelated, where relations are known. Our conclusion is that the proposed correspondences are unlikely to have arisen by chance, supporting a distant relationship between Proto-Basque as reconstructed by Blevins (2018) and Proto-Indo-European.
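
The simulation logic summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the bigram phonotactic model, the equal-length position-by-position alignment, the `min_support` threshold, and all function names below are simplifying assumptions chosen for illustration, and each character is treated as one segment. The sketch generates random word lists from two phonotactic models, counts word pairs whose segment correspondences all recur across the list, and reports how often chance alone reaches the observed count.

```python
import random
from collections import Counter, defaultdict


def train_bigram(words):
    """Collect observed successors for each segment; '#' marks word edges."""
    successors = defaultdict(list)
    for word in words:
        padded = "#" + word + "#"
        for prev, nxt in zip(padded, padded[1:]):
            successors[prev].append(nxt)
    return successors


def sample_word(successors, rng, max_len=8):
    """Generate one pseudo-word by walking the bigram model."""
    segments, prev = [], "#"
    while len(segments) < max_len:
        nxt = rng.choice(successors[prev])
        if nxt == "#":  # word boundary reached
            break
        segments.append(nxt)
        prev = nxt
    return "".join(segments)


def count_matches(list_a, list_b, min_support=2):
    """Count word pairs whose aligned segment pairs all recur in the list.

    Assumption: only equal-length pairs are aligned, position by position.
    """
    support = Counter()
    for a, b in zip(list_a, list_b):
        if len(a) == len(b):
            # Count each correspondence once per word pair.
            support.update(set(zip(a, b)))
    return sum(
        1
        for a, b in zip(list_a, list_b)
        if a and len(a) == len(b)
        and all(support[pair] >= min_support for pair in zip(a, b))
    )


def chance_probability(words_a, words_b, observed, n_items, trials=1000, seed=0):
    """Estimate how often random lists yield at least `observed` matches."""
    rng = random.Random(seed)
    model_a, model_b = train_bigram(words_a), train_bigram(words_b)
    hits = 0
    for _ in range(trials):
        sim_a = [sample_word(model_a, rng) for _ in range(n_items)]
        sim_b = [sample_word(model_b, rng) for _ in range(n_items)]
        if count_matches(sim_a, sim_b) >= observed:
            hits += 1
    return hits / trials


# Hypothetical toy word lists, not data from the study.
toy_a = ["kata", "tapa", "pata", "taka"]
toy_b = ["gada", "daba", "bada", "daga"]
estimate = chance_probability(toy_a, toy_b, observed=3, n_items=10, trials=200)
```

A small `estimate` would indicate that the observed number of matches is rarely reached by chance under these toy assumptions. A realistic study would need proper segment alignment and far richer phonotactic models.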




  1. Albright, Adam
    2009 Feature-based generalisation as a source of gradient acceptability. Phonology 26: 9–41.
    https://doi.org/10.1017/S0952675709001705
  2. Ariztimuño, Borja, Eneko Zuloaga & Dorota Krajewska
    2019 Against the Proto-Indo-European hypothesis, or why Basque continues to be a language isolate. Talk presented at the Societas Linguistica Europaea 52nd Annual Meeting, Leipzig, Germany.
  3. Blasi, Damián E., Søren Wichmann, Harald Hammarström, Peter F. Stadler & Morten H. Christiansen
    2016 Sound-meaning association biases evidenced across thousands of languages. Proceedings of the National Academy of Sciences 113(39): 10818–10823.
    https://doi.org/10.1073/pnas.1605782113
  4. Blevins, Juliette
    2018 Advances in Proto-Basque reconstruction with evidence for the Proto-Indo-European-Euskarian hypothesis. London & New York: Routledge.
    https://doi.org/10.4324/9780429505911
  5. Blevins, Juliette
    2020 Derivational patterns in Proto-Basque word structure. In Pavel Stekauer & Lívia Körtvélyessy (eds.), The complexity of complex words, 222–243. Cambridge: Cambridge University Press.
    https://doi.org/10.1017/9781108780643.011
  6. Blust, Robert
    2013 The Austronesian languages. Canberra: Pacific Linguistics.
  7. Blust, Robert & Stephen Trussel
    ongoing The Austronesian comparative dictionary. Revision 12/17/2016. www.trussel2.com/acd/
  8. Buck, Carl Darling
    1949 A dictionary of selected synonyms in the principal Indo-European languages. Chicago: University of Chicago Press.
  9. Campbell, Lyle & William J. Poser
    2008 Language classification: History and method. Cambridge: Cambridge University Press.
    https://doi.org/10.1017/CBO9780511486906
  10. Covington, Michael
    1996 An algorithm to align words for historical comparison. Computational Linguistics 22: 481–496.
  11. Derksen, Rick
    2008 Etymological dictionary of the Slavic inherited lexicon. Leiden: Brill.
  12. Dockum, Rikker & Claire Bowern
    2019 Swadesh lists are not long enough: Drawing phonological generalizations from limited data. Language Documentation and Description 16: 35–54.
  13. Dolgopolsky, Aharon
    1964 Гипотеза древнейшего родства языковых семей северной евразии с вероятностной точки зрения. [A probabilistic hypothesis concerning the oldest relationships among the languages of northern Eurasia.] Voprosy yazykoznaniya 2: 53–63.
  14. Dunkel, George E.
    2014 Lexikon der indogermanischen Partikeln und Pronominalstämme (2 vols.). Heidelberg: Universitätsverlag Winter.
  15. Dunn, Michael & Angela Terrill
    2012 Assessing the evidence for a Central Solomons Papuan family using the Oswalt Monte Carlo Test. Diachronica 29: 1–27.
    https://doi.org/10.1075/dia.29.1.01dun
  16. Egurtzegi, Ander
    2013 Phonetics and phonology. In Mikel Martínez-Areta (ed.), Basque and Proto-Basque. Language-internal and typological approaches to linguistic reconstruction [Mikroglottika 5], 119–172. Frankfurt am Main: Peter Lang.
  17. Egurtzegi, Ander
    2014 Towards a phonetically grounded diachronic phonology of Basque. PhD dissertation, Euskal Herriko Unibertsitatea.
  18. Fortson, Benjamin W. IV
    2010 Indo-European language and culture: An introduction. Second edition. Oxford: Wiley-Blackwell.
  19. François, Alexandre
    2008 Semantic maps and the typology of colexification: Intertwining polysemous networks across languages. In M. Vanhove (ed.), From polysemy to semantic change, 163–215. Amsterdam: John Benjamins.
    https://doi.org/10.1075/slcs.106.09fra
  20. Gamkrelidze, Thomas & Vjačeslav Ivanov
    1995 Indo-European and the Indo-Europeans (trans. Johanna Nichols). Berlin and New York: Mouton de Gruyter.
    https://doi.org/10.1515/9783110815030
  21. GCSE
    (no date) (9–1) Classical Greek J292/01 Language defined vocabulary list and restricted vocabulary list. https://www.ocr.org.uk/Images/221511-gcse-classical-greek-j292-defined-vocabulary-list-and-restricted-vocabulary-list.pdf
  22. Goddard, Ives
    1975 Algonquian, Wiyot, and Yurok: Proving a distant genetic relationship. In M. Dale Kinkade, Kenneth L. Hale & Oswald Werner (eds.), Linguistics and anthropology: In honor of C.F. Voegelin, 249–262. Lisse: Peter de Ridder Press.
    https://doi.org/10.1515/9783112420461
  23. Gorman, Kyle
    2016 Pynini: A Python library for weighted finite-state grammar compilation. In Proceedings of the ACL Workshop on Statistical NLP and Weighted Automata, 75–80.
    https://doi.org/10.18653/v1/W16-2409
  24. Gorrochategui, Joaquín
    1984 Estudio sobre la onomástica indígena de Aquitania. Bilbao: University of the Basque Country and University of Salamanca.
  25. Haspelmath, Martin & Uri Tadmor
    (eds.) 2009 Loanwords in the world’s languages: A comparative handbook. Berlin: Mouton de Gruyter.
    https://doi.org/10.1515/9783110218442
  26. Holman, Eric W., Søren Wichmann, Cecil H. Brown, Viveka Velupillai, André Müller & Dik Bakker
    2008 Explorations in automated language classification. Folia Linguistica 42: 331–354.
    https://doi.org/10.1515/FLIN.2008.331
  27. Jäger, Gerhard
    2013 Phylogenetic inference from word lists using weighted alignment with empirically determined weights. Language Dynamics and Change 3(2): 245–291.
    https://doi.org/10.1163/22105832-13030204
  28. Jäger, Gerhard, Johann-Mattis List & Pavel Sofroniev
    2017 Using support-vector machines and state-of-the-art algorithms for phonetic alignment to identify cognates in multi-lingual wordlists. In Proceedings of the European ACL 2017, 1205–1216.
    https://doi.org/10.18653/v1/E17-1113
  29. Jansche, Martin
    2003 Inference of string mappings for speech technology. PhD dissertation, The Ohio State University.
  30. Johansson, Niklas Erben, Andrey Anikin, Gerd Carling & Arthur Holmer
    2020 The typology of sound symbolism: Defining macro-concepts via their semantic and phonetic features. Linguistic Typology.
    https://doi.org/10.1515/lingty-2020-2034
  31. Jurafsky, Dan & James Martin
    2018 Speech and language processing. Third edition draft. https://web.stanford.edu/~jurafsky/slp3/
  32. Kassian, Alexei, Mikhail Zhivlov & George Starostin
    2015 Proto-Indo-European-Uralic comparison from the probabilistic point of view. The Journal of Indo-European Studies 43(3–4): 301–347.
  33. Kessler, Brett
    2001 The significance of word lists: Statistical tests for investigating historical connections between languages. Stanford, CA: CSLI Publications. Distributed by The University of Chicago Press.
  34. Kessler, Brett
    2015 Computational and quantitative approaches to historical phonology. In P. Honeybone & J. Salmons (eds.), The Oxford handbook of historical phonology, 133–148. Oxford: Oxford University Press.
  35. Kloekhorst, Alwin
    2008 Etymological dictionary of the Hittite inherited lexicon. Amsterdam: Brill.
  36. Kondrak, Grzegorz
    2000 A new algorithm for the alignment of phonetic sequences. In Proceedings of NAACL 2000, 288–295. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
  37. Kondrak, Grzegorz
    2002 Algorithms for language reconstruction. PhD dissertation, University of Toronto.
  38. Lakarra, Joseba, Julen Manterola & Iñaki Segurola
    (eds.) 2019 Euskal Hiztegi Historiko-Etimologikoa (EHHE-200). Bilbo: Euskaltzaindia.
  39. Linguistics Research Center
    (no date) Ancient Sanskrit online. Sanskrit base form dictionary. Linguistics Research Center, University of Texas at Austin. https://lrc.la.utexas.edu/eieol_base_form_dictionary/vedol/7
  40. List, Johann-Mattis
    2012 SCA. Phonetic alignment based on sound classes. In M. Slavkovik & D. Lassiter (eds.), New directions in logic, language, and computation, 32–51. Berlin and Heidelberg: Springer.
    https://doi.org/10.1007/978-3-642-31467-4_3
  41. List, Johann-Mattis
    2014 Sequence comparison in historical linguistics. Düsseldorf: Düsseldorf University Press.
  42. List, Johann-Mattis & Steven Moran
    2013 An open source toolkit for quantitative historical linguistics. In Proceedings of the ACL 2013 System Demonstrations.
  43. List, Johann-Mattis, Thomas Mayer, Anselm Terhalle & Matthias Urban
    (eds.) 2014 CLICS: Database of cross-linguistic colexifications. Marburg: Forschungszentrum Deutscher Sprachatlas. clics.lingpy.org
  44. List, Johann-Mattis, Simon Greenhill, Cormac Anderson, Thomas Mayer, Tiago Tresoldi & Robert Forkel
    (eds.) 2019 CLICS3. [accessed at clics.clid.org]
  45. List, Johann-Mattis, Mary Walworth, Simon Greenhill, Tiago Tresoldi & Robert Forkel
    2018 Sequence comparison in computational historical linguistics. Journal of Language Evolution 3(2): 130–144.
    https://doi.org/10.1093/jole/lzy006
  46. Martínez-Areta, Mikel
    (ed.) 2013 Basque and Proto-Basque. Language-internal and typological approaches to linguistic reconstruction [Mikroglottika 5]. Frankfurt am Main: Peter Lang.
    https://doi.org/10.3726/978-3-653-02701-3
  47. Michelena, Luis
    1961 Fonética histórica vasca. First edition. Donostia-San Sebastián.
  48. Michelena, Luis
    1977 [2011] Fonética histórica vasca. 2nd edition. In J. A. Lakarra & I. Ruiz Arzalluz (eds.), Luis Michelena. Obras completas VI (Supplements of ASJU 59). Donostia-San Sebastián, Vitoria-Gasteiz: Diputación Foral de Guipuzcoa, University of the Basque Country.
  49. Michelena, Luis & Ibon Sarasola
    1987–2005 Orotariko Euskal Hiztegia [OEH], [General Basque Dictionary]. 16 volumes. Bilbao: Euskaltzaindia. [updated online version accessed at www.euskaltzaindia.eus/oeh]
  50. Nichols, Johanna
    1996 The comparative method as heuristic. In M. Durie & M. Ross (eds.), The comparative method reviewed: Regularity and irregularity in language change, 39–71. Oxford: Oxford University Press.
  51. Orel, Vladimir
    1998 Albanian etymological dictionary. Leiden: Brill.
  52. Orotariko Euskal Hiztegia [OEH]
    See Michelena, Luis & Ibon Sarasola.
  53. Oswalt, Robert L.
    1970 The detection of remote linguistic relationships. Computer Studies in the Humanities and Verbal Behavior 3: 117–129.
  54. Ratcliffe, Robert
    2015 On calculating the reliability of the comparative method at long and medium distances: Afroasiatic comparative lexica as a test case. Journal of Historical Linguistics 2(2): 239–281.
    https://doi.org/10.1075/jhl.2.2.04rat
  55. Ringe, Don, Tandy Warnow & Ann Taylor
    2002 Indo-European and computational cladistics. Transactions of the Philological Society 100(1): 59–129.
    https://doi.org/10.1111/1467-968X.00091
  56. Ristad, Eric & Peter Yianilos
    1998 Learning string-edit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(5): 522–532.
    https://doi.org/10.1109/34.682181
  57. Rix, Helmut, Martin Kümmel, Thomas Zehnder, Reiner Lipp & Brigitte Schirmer
    2001 Lexikon der Indogermanischen Verben. Wiesbaden: Dr. Ludwig Reichert Verlag.
  58. Rix, Helmut, Martin Kümmel, Thomas Zehnder, Reiner Lipp & Brigitte Schirmer
    2014 Lexikon der Indogermanischen Verben: Die Wurzeln und ihre Primärstammbildungen. Unter der Leitung von Helmut Rix und der Mitarbeit vieler anderer bearbeitet von Martin Kümmel, Thomas Zehnder, Reiner Lipp, Brigitte Schirmer. Third edition, electronic file, March 2014.
  59. Roark, Brian, Michael Riley, Cyril Allauzen, Terry Tai & Richard Sproat
    2012 The OpenGrm open-source finite-state grammar software libraries. ACL 2012, Jeju Island, Korea, July.
  60. Roark, Brian & Richard Sproat
    2007 Computational approaches to morphology and syntax. Oxford: Oxford University Press.
  61. Schrijver, Peter
    2002 Irish ainder, Welsh anner, Breton annoar, Basque andere. In D. Restle & Dietmar Zaefferer (eds.), Sounds and systems: Studies in structures and change: A festschrift for Theo Vennemann, 205–219. Berlin & New York: Mouton de Gruyter.
    https://doi.org/10.1515/9783110894653.205
  62. Slaska, Natalia
    2006 Meaning lists in lexicostatistical studies: Evaluation, application, ramifications. PhD dissertation, University of Sheffield.
  63. St. Arnaud, Adam, David Beck & Grzegorz Kondrak
    2017 Identifying cognate sets across dictionaries of related languages. Proceedings of the Conference on Empirical Methods in Natural Language Processing. Copenhagen. 2519–2528.
    https://doi.org/10.18653/v1/D17-1267
  64. Swadesh, Morris
    1952 Lexicostatistic dating of prehistoric ethnic contacts. Proceedings of the American Philosophical Society 96: 452–463.
  65. Swadesh, Morris
    1955 Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21: 121–137.
    https://doi.org/10.1086/464321
  66. Swadesh, Morris
    1971 The origin and diversification of language. Joel F. Sherzer (ed.). Chicago: Aldine-Atherton.
  67. Tadmor, Uri, Martin Haspelmath & Bradley Taylor
    2010 Borrowability and the notion of basic vocabulary. Diachronica 27: 226–264.
    https://doi.org/10.1075/dia.27.2.04tad
  68. Tahmasebi, Nina, Lars Borin & Adam Jatowt
    2018 Survey of computational approaches to diachronic conceptual change detection. https://arxiv.org/abs/1811.06278
  69. Tai, Terry, Wojciech Skut & Richard Sproat
    2011 Thrax: An open source grammar compiler built on OpenFst. ASRU 2011, Waikoloa Resort, Hawaii, December.
  70. Teeter, Karl V.
    1964 Algonquian languages and genetic relationship. In Horace G. Lunt (ed.), Proceedings of the Ninth International Congress of Linguists, 1026–1034. The Hague: Mouton.
  71. Trask, Robert L.
    1997 The history of Basque. London: Routledge.
  72. Trask, Robert L.
    2008 Etymological dictionary of Basque. Posthumous edition. Unpublished. Edited for the web by M. W. Wheeler. University of Sussex.
  73. Turchin, Peter, Ilia Peiros & Murray Gell-Mann
    2010 Analyzing genetic connections between languages by matching consonant classes. Journal of Language Relationship 3: 117–126.
  74. Uhlenbeck, C. C.
    1909–1910 Contribution à une phonétique comparative des dialectes basques. Revista Internacional de los Estudios Vascos 3: 465–503; 4: 65–188.
  75. Vendryes, Joseph
    1959 Lexique étymologique de l’irlandais ancien: Lettre A. Paris: CNRS.
  76. Wodtko, Dagmar S., Britta Irslinger & Carolin Schneider
    2008 Nomina im Indogermanischen Lexikon. Heidelberg: Universitätsverlag Winter.
