Volume 29, Issue 2
  • ISSN 1384-6655
  • E-ISSN: 1569-9811

This paper introduces the Lannang Corpus (LanCorp), a public 375,000-word collection of raw and transcribed recordings of Lannang languages spoken in metropolitan Manila. The corpus is annotated with part-of-speech tags and linked to 40 types of sociolinguistic metadata. The paper first gives an overview of LanCorp (e.g. its design, formats, and accessibility) and then shows, with several examples, how the corpus can be used for variationist sociolinguistic research, using Lánnang-uè data as a case study. The findings from these exploratory studies indicate that Lannang languages are influenced by sociolinguistic factors, demonstrating the intricate nature of the Sino-Philippine sociolinguistic ecology. Because of its large size, sociolinguistic metadata, and various formats, LanCorp can be used to study both Lannang languages in general and how they are used by specific social groups. It enables scholars to investigate multilingual interactions in relation to a wide range of sociolinguistic factors, furthering the field of Sino-Philippine (socio)linguistics.



