Volume 39, Issue 1
  • ISSN 0176-4225
  • E-ISSN: 1569-9714



While analysing lexical data of Western Kho-Bwa languages of the Sino-Tibetan or Trans-Himalayan family with the help of a computer-assisted approach to historical language comparison, we observed gaps in the data where one or more varieties lacked forms for certain concepts. We employed a new workflow, combining manual and automated steps, to predict the most likely phonetic realisations of the missing forms by making systematic use of the sound correspondences in words that were potentially cognate with them. This procedure yielded a list of hypothetical reflexes of previously identified cognate sets, which we first preregistered as an experiment on the prediction of unattested word forms and then compared with actual word forms elicited during secondary fieldwork. In this study, we first describe the workflow that we used to predict hypothetical reflexes and the process of eliciting actual word forms during fieldwork. We then present the results of our reflex prediction experiment. Based on this experiment, we identify four general benefits of reflex prediction in historical language comparison: (1) increased transparency of linguistic research, (2) increased efficiency of field and source work, (3) an educational aspect, in that reflex prediction offers teachers and learners a wide range of linguistic phenomena, including the regularity of sound change, and (4) the possibility of kindling speakers’ interest in their own linguistic heritage.
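
The general idea of predicting a missing reflex from sound correspondences can be illustrated with a minimal Python sketch. It is not the workflow used in this study, which combines manual and automated steps. It only shows the underlying logic under strong simplifying assumptions: the forms are already segmented and aligned, the varieties `A`, `B` and `C` and all word forms are invented for illustration, and the hypothetical function `predict_reflex` ignores phonological context such as the position of a segment in the syllable.

```python
from collections import Counter

# Invented toy data: each cognate set maps a variety to its aligned segments.
# These are NOT real Western Kho-Bwa forms.
cognate_sets = [
    {"A": ["p", "a", "n"], "B": ["b", "a", "n"], "C": ["f", "a", "ŋ"]},
    {"A": ["p", "i", "t"], "B": ["b", "i", "t"], "C": ["f", "e", "t"]},
    {"A": ["t", "a", "n"], "B": ["d", "a", "n"], "C": ["t", "a", "ŋ"]},
]

# A cognate set in which variety C lacks a form.
incomplete_set = {"A": ["p", "i", "n"], "B": ["b", "i", "n"]}


def predict_reflex(cognate_sets, target_set, target_language):
    """Predict the segments of the missing form of `target_language`.

    For every position of the incomplete set, look for alignment columns
    in complete sets whose segments agree with all attested varieties,
    and take the most frequent segment of the target variety in them.
    """
    known = {lang: segs for lang, segs in target_set.items()
             if lang != target_language}
    length = len(next(iter(known.values())))
    prediction = []
    for position in range(length):
        votes = Counter()
        for other in cognate_sets:
            if target_language not in other:
                continue  # this set cannot tell us anything about the target
            shared = [lang for lang in known if lang in other]
            if not shared:
                continue
            for column, segment in enumerate(other[target_language]):
                # A column supports a prediction only if it matches
                # every attested variety at this position.
                if all(other[lang][column] == known[lang][position]
                       for lang in shared):
                    votes[segment] += 1
        # "?" marks positions without any supporting correspondence.
        prediction.append(votes.most_common(1)[0][0] if votes else "?")
    return prediction


predicted = predict_reflex(cognate_sets, incomplete_set, "C")
print("".join(predicted))
```

A predicted form of this kind is a hypothesis only. Its value lies in being recorded before elicitation, so that it can later be compared with the word form actually attested in the field.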

Available under the CC BY-NC 4.0 license.




