Volume 26, Issue 2
  • ISSN 1384-6655
  • E-ISSN: 1569-9811

This paper examines the use of the three non-periphrastic Spanish subjunctives in clauses embedded under obligatory subjunctive predicates in the past tense in three varieties: Argentinean, Mexican and Peninsular Spanish. By means of random forest and logistic regression analyses, I demonstrate that a grammar in which the two “past” subjunctives make up one group, such that the variation can be modeled as a binary opposition between (morphologically) present vs. (morphologically) past, achieves better prediction accuracy and goodness-of-fit measures than a grammar with a three-way split. The results suggest that, at least in complement clauses of obligatory subjunctive predicates, there appear to be no semantic differences between the two past subjunctives, but that relatively large differences remain in how the three subjunctive forms are used across the three Spanish varieties studied.
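
To make the contrast between the two groupings concrete, the following minimal sketch recodes a three-way form label into a binary one and scores the same set of predictions under both schemes. It is not the paper's analysis: the label names (`present`, `ra`, `se`), the helper functions, and the toy tokens are all assumptions invented for illustration, and no random forest or regression model is fitted here.

```python
# Hypothetical labels for the three subjunctive forms (assumed names,
# not taken from the paper): "present", "ra" and "se".
BINARY_MAP = {"present": "present", "ra": "past", "se": "past"}


def collapse(labels):
    """Recode three-way form labels into the binary present/past scheme."""
    return [BINARY_MAP[label] for label in labels]


def accuracy(gold, predicted):
    """Share of tokens whose predicted form matches the attested form."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Invented toy data: attested forms and a classifier's guesses.
gold = ["present", "ra", "se", "ra", "present", "se", "ra", "ra"]
pred = ["present", "se", "ra", "ra", "present", "ra", "ra", "present"]

three_way = accuracy(gold, pred)                    # three-way split
binary = accuracy(collapse(gold), collapse(pred))   # two "past" forms merged

print(f"three-way accuracy: {three_way:.3f}")
print(f"binary accuracy:    {binary:.3f}")
```

Merging labels after prediction can never lower accuracy, so this sketch only illustrates the recoding step and the metric. A comparison of the kind the abstract reports would require fitting a separate model for each grouping.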


