Volume 7, Issue 1
  • ISSN 2215-1478
  • E-ISSN: 2215-1486

Abstract

While linguistic complexity analysis of learner language has traditionally been based mostly on essays, there is increasing interest in other task types. This is crucial for obtaining a broader empirical basis for characterizing language proficiency, and it highlights the need to advance our understanding of how task and learner properties interact in shaping the linguistic complexity of learner productions. It also makes it important to determine which complexity measures generalize well across which tasks.

In this paper, we investigate the linguistic complexity of answers to reading comprehension questions written by college-level foreign language learners of German. Analyzing the corpus with computational linguistic methods that identify a wide range of complexity features, we explore which linguistic complexity analyses can successfully be performed for such short answers, how learner proficiency impacts the results, how generalizable they are across different contexts, and how the quality of the underlying analysis impacts the results.

/content/journals/10.1075/ijlcr.20006.wei
2021-03-01
2025-02-09
