Volume 8, Issue 2
  • ISSN 2542-3835
  • E-ISSN: 2542-3843

Abstract

Poor sampling practices can constitute a questionable research practice in L2 inferential quantitative research. The current study, a methodological synthesis of sampling practices (433 Scopus/Web of Science (WoS) reports, selected via cluster random sampling), revealed that L2 inferential quantitative researchers rarely employed randomized and/or effect size-driven sampling processes: only eight (1.8%) and ten (2.3%) of the reports, respectively, were satisfactory in these respects. Furthermore, just 33.9% of the reports featured multisite (convenience) samples. In models assessing what predicted multisite sampling, whether the report was ISLA-focused (coefficient = −.33, p < .001) or single-authored (coefficient = −.15, p < .001) showed moderate and weak negative associations, respectively. Citation analysis metric values and the Scopus/WoS contrast showed no associations. The findings suggest that the field's sampling practices have room to improve, and guidance for future improvement is offered.
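
The proportions reported in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis: the counts (8 and 10 of 433 reports) come from the abstract, while the Wilson score interval, the 95% level (z = 1.96), and all function and variable names are assumptions introduced here for illustration.

```python
import math

TOTAL_REPORTS = 433  # number of reports stated in the abstract


def percentage(count, total=TOTAL_REPORTS):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def wilson_interval(count, total=TOTAL_REPORTS, z=1.96):
    """Wilson score interval for a single proportion.

    Assumption: a 95% Wilson interval is used purely for illustration;
    the abstract does not say which interval method, if any, was applied.
    """
    p = count / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return centre - half_width, centre + half_width


# Counts taken from the abstract.
randomized_pct = percentage(8)          # reports with randomized sampling
effect_size_pct = percentage(10)        # reports with effect size-driven sampling

print(randomized_pct, effect_size_pct)  # 1.8 2.3
print(wilson_interval(8))
print(wilson_interval(10))
```

Both percentages match the values in the abstract. The intervals illustrate how much uncertainty surrounds such small proportions in a sample of this size.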

Available under the CC BY-NC 4.0 license.

/content/journals/10.1075/jsls.00049.vit
2025-08-18
2026-03-11
