Volume 28, Issue 2
  • ISSN 1384-6655
  • E-ISSN: 1569-9811



Automated tools for syntactic complexity measurement are increasingly used to analyze various kinds of second language corpora, even though these tools were originally developed and tested for texts produced by advanced learners. This study investigates the reliability of automated complexity measurement for beginner and lower-intermediate L2 English data by comparing manual and automated analyses of a corpus of 80 texts written by Dutch-speaking learners. Our quantitative and qualitative analyses reveal that the reliability of automated complexity measurement is substantially affected by learner errors, parser errors, and pattern undergeneration. We also demonstrate the importance of aligning the definitions of analytical units between the computational tool and human annotators. To enhance the reliability of automated analyses, we recommend that certain modifications be made to the system and that non-advanced L2 English data be preprocessed prior to automated analysis.




