Tracing semantic change in Portuguese

Abstract

This study uses word embeddings to investigate the semantic changes underlying the creation of two adversative connectives in Portuguese, porém and mas ‘but, however’. For porém, we chart its development from an original PP formed by a preposition with a causal meaning and a demonstrative pronoun that referred anaphorically to a previous proposition. For mas, we trace its change from an adverb meaning ‘more’. Adopting a distributional semantics approach, we use word embedding models trained on two corpora, the CIPM (containing texts from the 12th–16th centuries) and COLONIA (containing texts from the 16th–20th centuries). We produce a measure of change based on the similarity scores of porém and mas with respect to words in relevant semantic categories in each corpus, representing the source and the target meanings. This paper, which constitutes the first computational study of semantic change in Portuguese, also discusses challenges and outlines steps to be taken into consideration when choosing embedding algorithms for small historical corpora.
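
The kind of measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one embedding model per corpus, takes the mean cosine similarity between a connective and a set of seed words for a semantic category, and reports the later-minus-earlier difference. The function names, the seed words (mais for the source meaning, contudo for the target meaning) and the two-dimensional vectors are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_similarity(embeddings, target, category_words):
    """Average cosine similarity of `target` to the category words found in one model."""
    present = [w for w in category_words if w in embeddings]
    if target not in embeddings or not present:
        raise KeyError("target or category words missing from this model")
    total = sum(cosine(embeddings[target], embeddings[w]) for w in present)
    return total / len(present)

def change_score(early, late, target, category_words):
    """Later-minus-earlier mean similarity; positive means the target moved toward the category."""
    return (mean_similarity(late, target, category_words)
            - mean_similarity(early, target, category_words))

# Toy vectors invented for illustration only; they are NOT taken from CIPM or COLONIA.
early_model = {"mas": [1.0, 0.0], "mais": [1.0, 0.0], "contudo": [0.0, 1.0]}
late_model = {"mas": [0.0, 1.0], "mais": [1.0, 0.0], "contudo": [0.0, 1.0]}

# Hypothetical seed lists: one word per category to keep the example short.
source_shift = change_score(early_model, late_model, "mas", ["mais"])     # 'more' (source meaning)
target_shift = change_score(early_model, late_model, "mas", ["contudo"])  # adversative (target meaning)
print(source_shift, target_shift)
```

In this sketch every similarity is computed inside a single model, so the two vector spaces never need to be aligned with each other; only the resulting scores are compared across periods. With real historical corpora the seed lists would contain many words per category, and the scores would depend on the embedding algorithm and its settings.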

/content/journals/10.1075/jhl.21028.ama
2022-04-25
2022-05-21
