Volume 172, Issue 1
  • ISSN 0019-0829
  • E-ISSN: 1783-1490



The average quantitative research report in applied linguistics is needlessly complicated. Articles with over fifty hypothesis tests are not unusual, yet despite such an onslaught of numbers, the patterns in the data often remain opaque even to readers well-versed in quantitative methods, let alone to colleagues, students, and non-academics without years of experience in navigating results sections. I offer five suggestions for increasing both the transparency and the simplicity of quantitative research reports: (1) round numbers, (2) draw more graphs, (3) run and report fewer significance tests, (4) report simple rather than complex analyses when they yield essentially the same results, and (5) use online appendices liberally to document secondary analyses and share code and data.




