Noun phrase complexity in young Spanish EFL learners’ writing: Complementing syntactic complexity indices with corpus-driven analyses

The research reported in this article examines Noun Phrase (NP) syntactic complexity in the writing of Spanish EFL secondary school learners in Grades 7, 8, 11 and 12 in the International Corpus of Crosslinguistic Interlanguage. Two methods were combined: a manual parsing of NPs and an automatic analysis of NP indices using the Tool for the Automatic Analysis of Syntactic Sophistication and Complexity (TAASSC). Our results revealed that it is in premodifying slots that syntactic complexity in NPs develops. We argue that two measures, (i) nouns and modifiers (a syntactic complexity index) and (ii) determiner + multiple premodification + head (a NP type obtained as a result of a corpus-driven analysis), can be used as indices of syntactic complexity in young Spanish EFL learner language development. Besides offering a learner-language-driven taxonomy of NP syntactic complexity, the paper underscores the strength of using combined methods in SLA research.


Introduction
The study of noun use in learner writing at different educational or proficiency levels has been informed by previous analyses in the Learner Corpus Research tradition and, to a lesser extent, by studies of complexity, accuracy and fluency (CAF) measures (Larsen-Freeman, 2006; Skehan, 1989). Despite the attention to learner language development through the study of morphology and syntax (Ortega, 2009), the analysis of nouns in learner language has received little attention across these areas, as stated by Ortega (2003). While the study of cross-sectional use of nouns in collocational frames has gained popularity in corpus linguistics studies during the last decade, nouns have not been the primary focus of CAF studies. These studies have generally used complexity indices such as mean lengths (Lu, 2011), mainly involving clauses, T-units and sentences (Wolfe-Quintero et al., 1998). However, these indices have been shown to be sensitive to learner performance level. In their study of short-term changes in L2 writing complexity of adult ESL learners, Bulté & Housen (2014: 56) found that their results "yielded no significant development for some of the most popular complexity measures in the L2 literature". In other words, subordination measures and measures of lexical diversity and richness proved of little use in their study. Bulté & Housen suggest that for lower proficiency levels, measurement of less synoptic styles should be in place. They indicate that length of the noun phrase (NP) increased with proficiency level, "pointing to increased use of determiners and modifiers of the NP" (Bulté & Housen, 2014: 50-51). Biber & Gray (2016) have shown that reducing complexity to the analysis of clausal elements is reductionist and fails to acknowledge the tendency in the 21st century to concentrate less explicit meaning in phrasal contexts.
The authors point out that embedded noun phrases create the conditions for more elaborated meanings that, in turn, are "more complex from a processing perspective, than alternative structures with dependent clauses" (Biber & Gray, 2016: 245).
New tools allow researchers to develop analyses of both lexical sophistication and syntactic complexity in ways which favour automatization and the use of a wider range of usage-based indices of syntactic sophistication (Kyle, 2016; Lu, 2017), including phrasal complexity. Despite the huge analytical potential of such indices, studies making use of corpora have largely ignored them. Similarly, the potential of corpora has not always been fully acknowledged in the complexity studies field, even though CAF advocates have used free-production data in interlanguage research in order to gain a better understanding of the ability to use the language "in real time across communicative contexts" (Ortega, 2009: 111). This paper sets out to provide a cross-sectional study of syntactic complexity in noun phrases in the English as a Foreign Language (EFL) writing of Spanish secondary school students. To do so, differences in the syntactic complexity of the noun phrases written on the same essay topic by students at the two lower and the two higher levels of secondary education are examined. We use a combination of both learner corpus analysis and syntactic complexity measures in order to provide a comprehensive analysis of noun phrases in the writing of young Spanish learners of EFL in instructed settings and a discussion of how NP measures can inform Second Language Acquisition (SLA) theory in general and the study of syntactic complexity in particular. Based on a broad taxonomy model of L2 complexity that understands syntactic complexity at the phrasal, clausal and sentence levels (Bulté & Housen, 2014), we adopt Kyle & Crossley's (2018) conceptualization of syntactic complexity based on fine-grained analyses of noun phrase complexity, as these have been found to be better predictors of writing quality.

Nouns in learner language research
In this section the role played by NPs in the conceptualization of complexity (Section 2.1) and the definition of complexity in the NP (Section 2.2) are considered.

Complexity and NPs
Although CAF studies have been used both as performance descriptors and indicators of learners' proficiency (Housen & Kuiken, 2009), the potential of nouns to generate complexity measures has not yet been fully explored. In fact, recent corpus research in complexity (Alexopoulou et al., 2017; Foster & Tavakoli, 2009) continues to rely on traditional measures such as average sentence length, mean clause length and number of subordinate clauses per T-unit. Arguably, the lack of previous work in this area may have prevented researchers from exploring the potential of nouns for the analysis of complexity. One possible reason that may explain this lack of interest in noun-based measures is that nouns are processed by learners earlier than other types of words (VanPatten, 2002), meaning that their analysis can tell us little about interlanguage development. Nouns can be seen in this fashion as a type of basic-level construction that is acquired as easily in one's L2 (second language) as in one's L1 (first language) (Tomasello, 2003). Ellis et al. (2016) single out nouns as reliable and robust learnable constructions given their high frequency of use and the visual and motor experiences of children when interacting with such categories as those embodied by nouns in basic referential meanings. These meanings are likely to be universal and, in the case of simple NPs, we may presume that most basic noun-driven referential constructions can be transferred across languages, as all languages have word-like constructions to name things and entities (Tomasello, 2003). The implication here is that this transfer occurs effortlessly and nouns are easily acquired as opposed to, say, subordinate clauses or other clausal features.
However, in relation to L2 learning and development, such assumptions in the acquisition of linguistic complexity have been discarded by researchers. Housen et al. (2012) suggest that it is both speculative and simplistic to think that more complex interlanguage (IL) systems necessarily lead to increased accuracy and fluency in the L2. Research in the last decades seems to corroborate this notion. Larsen-Freeman (1997: 151) states that the learning of linguistic items is not a linear process as learners "do not master one item and then move on to another". Although CAF values appear to increase as learners progress through developmental sequences, between-subject and within-subject variation is frequently reported by researchers (Vyatkina, 2012). In particular, individuals' developmental variability changes across time, as shown by Larsen-Freeman (2006) and Verspoor et al. (2008). Studies such as Foster & Tavakoli's (2009), on the other hand, have identified common patterns of complexity in different (L1) groups of learners and native speakers. Vyatkina's (2013) follow-up study of two learners of German in Vyatkina (2012) introduced coordinate nominal phrases as a measure of complexity. In the two subjects reported, these phrases are mainly used to list relatives and describe TV information sources in two different tasks. Her study confirms previous accounts of nonlinearity trends in the learners' progression towards increased communicative competence and suggests that "rarely used specific features such as coordinate structures, complex nominals and nonfinite verb forms can reveal new facets" of the development of syntactic complexity (Vyatkina, 2013: 24). Vyatkina (2013: 24) indicates that complexity through phrasal elaboration is a characteristic of "more advanced writing". However, this finding is not fully endorsed by Byrnes & Sinicrope (2008), who analyse relativization in the production of twenty-three learners of German. 
They use the NP Accessibility Hierarchy framework developed by Keenan & Comrie (1977). Byrnes & Sinicrope's (2008) findings support the notion that even at lower levels, some of the participants "acquired the most difficult relative clauses by the end of the first year of intensive study of German" (Keenan & Comrie, 1977: 132). The authors show evidence that the OPREP clause type (The small case in which she kept…) is produced in beginners' writing under non-experiment conditions. Judging from the research discussed in this section, it appears that the NP needs to be analysed as a valid syntactic unit to study complexity.

Defining complexity in the NP
Nouns have not been specifically targeted (Rimmer, 2006) unless a register perspective is adopted (Biber & Gray, 2010). Studies in this tradition have highlighted that one of the most relevant features of academic English, if not the most relevant, is the reliance on phrasal structures, particularly complex noun phrases with phrasal modifiers. A different set of studies has examined verb + noun collocations in learner language (Chan & Liou, 2005; Laufer & Waldman, 2011; Luzón, 2011; Tsai, 2015). However, these studies have paid more attention to phraseology than to complexity.
Previous studies have analysed the structure of NPs by considering the constituents of this phrase type, i.e. the head, the determinative, the premodification and the postmodification (Quirk et al., 1985: 1238-1239) or, using a similar terminology, the head, the determiners, the modifiers (pre and post head) and the complements (Biber et al., 1999). According to Biber et al. (1999), only the headword and the determiners are compulsory elements of the NP. Therefore, NP complexity is defined by the presence of non-compulsory elements in the NP structure and the use of recursive embedding. Thus, for Biber et al. (1999: 576) the following three NPs are increasingly more complex: (i) a study; (ii) a study of intraspecific variability; and (iii) a study of intraspecific variability focused on developmental physiology.
Some automatic tools can analyse complexity in NPs. Among the most commonly used ones are Lu's (2010) L2 Syntactic Complexity Analyzer (https://aihaiyang.com/software/l2sca/) and TAASSC (Kyle, 2016). The former analyses fourteen measures which consider length of units, coordination, subordination, phrasal sophistication and sentence complexity. Lu (2010) offers indices that account for the length of units (mean length of clauses or sentences), the amount of subordination (dependent clauses per clause and per sentence), the amount of coordination per clause and per T-unit, and the degree of phrasal sophistication. This last category includes two metrics: complex nominals per clause and complex nominals per T-unit. In the L2 Syntactic Complexity Analyzer tool, complex nominals are defined as noun phrases that include (i) one or more pre- or postmodifiers, (ii) nominal clauses or (iii) gerunds and infinitives in subject position. TAASSC calculates different types of indices (Kyle, 2016: 56), as shown in Table 1. with 201 indices. After Varimax rotation, nine components explain the largest amount of the variance on a stratified random sample of 10,000 written texts from COCA. These nine components include sixty indices and explain 56% of the "shared variance in the data for the rotated components" (Kyle, 2016: 71). TAASSC offers researchers the opportunity to extract four compound noun-phrase-related indices. "NP elaboration" includes nineteen indices that "capture" (Kyle, 2016: 71) noun phrase elaboration by means of measuring prepositions, adjectives, determiners, and verbal modifiers of nominals. Among other metrics, NP elaboration computes the number of prepositions per nominal, dependents per nominal or dependents per nominal subject.
The second compound index, "Nouns as modifiers and modifier variation", provides detailed information on the use of nouns as nominal modifiers by calculating direct object and nominal subject modifiers, and the variation in the number of modifiers per nominal, including prepositional objects, direct objects, and nominal subjects. The third index, "Determiners", captures the use of determiners in noun phrases in general, including objects of the preposition, direct objects, and nominal subjects in particular. The fourth index, "Possessives", describes the use of possessives in nominal subjects, direct objects, and prepositional objects. Despite the dearth of research in NP complexity, the role of premodifying adjectives in NPs has recently attracted researchers' attention.  and  have examined the writing of twenty-one upper-intermediate/advanced international students in an EAP programme in New Zealand and compared the students' use of noun modifiers with the expert use of the same linguistic feature. They conclude that the learners' reliance on adjectival premodification may be tracked down to "an earlier developmental stage" (Parkinson & Musgrave, 2014: 153) and that the use of noun premodifiers is to be reinforced by means of instruction that deals with this feature of academic English. This finding suggests that as proficiency increases, so does the use of nouns as premodifiers.  find that noun premodification, prepositional phrases and appositive noun phrases are used significantly more frequently by the most proficient TESOL MA learners in their study. The authors suggest that it is essential to focus on nouns as premodifiers and prepositional phrases as postmodifiers in the framework of EAP programs. Paquot (2019) has examined ninety-eight research papers written by French EFL learners at the University of Louvain between 2009 and 2013. 
She finds that while adjective + noun dependencies show a significant difference in mean mutual information (MI) scores between the B2 and C2 CEFR levels, adjacent levels such as B2-C1 and C1-C2 show no statistically significant differences or, in Hawkins & Filipović's (2012) terminology, no criterial features are found between B2-C1 and C1-C2 levels. Paquot (2019) argues that language development at high proficiency levels is situated in the phraseological complexity dimension, and not so much in syntactic or lexical complexity. This finding suggests that an exploration of NP syntactic complexity is of particular interest in the lowest levels of foreign language (FL) proficiency, which contrasts with the abundance of studies dealing with upper-intermediate and advanced levels of proficiency. Few studies have actually addressed variation and complexity in learner language uses of noun phrases; for example, countable and uncountable nouns (Kobayashi, 2008), articles and uncountable nouns (Osborne, 2004), and article use (Díez-Bedmar, 2010; Díez-Bedmar & Papp, 2008; Díez-Bedmar & Pérez-Paredes, 2012; Ionin & Díez-Bedmar, in press; Leńko-Szymańska, 2012).
However, the above-mentioned studies exhibit two important limitations: (i) little information is available on young EFL learner language development, as most of the existing studies have focused mainly on learner writing at university level, very often at advanced levels; (ii) they do not offer a comprehensive picture of the syntactic complexity of noun phrases, thus failing to provide a detailed analysis of noun phrase complexity in learner language. Therefore, it is necessary to characterise NP complexity in learner writing across lower levels in a comprehensive way by complementing syntactic complexity indices and corpus-driven analyses. The overarching research question in this paper is the following:

RQ1. Does NP complexity differ across the years of instruction in Spanish EFL learners?
To provide an exhaustive answer to this research question, four further research questions related to the methods employed are posed:
RQ2. Does the use of NP complexity indices in TAASSC reveal differences in NP complexity across the writing by secondary school students?
RQ3. Does the manual parsing of NPs and the resulting corpus-driven classification of NP types cast light on differences in NP complexity across the writing by secondary school students?
RQ4. Are there criterial features, i.e. distinguishing features, in NP complexity across grades in secondary school Spanish learner writing?
RQ5. Do the methods employed in RQ2 and RQ3 complement each other?

Methodology
In this section, information on the learner corpus analysed is provided (Section 3.1). Sections 3.2 and 3.3 describe the two methods employed to analyse NP complexity in the learner corpus: the syntactic complexity indices selected are described in Section 3.2 and the manual corpus-driven analyses in Section 3.3. The results of each method are presented in turn (Sections 4.1.1 and 4.1.2, respectively), followed by a summary and comparison of the results obtained with both methods. The need to complement both methods to analyse NP complexity in learner writing is discussed in Section 4.2 and Section 5, where the answer to the overarching research question in this paper (RQ1) is provided.

Corpus
The data analysed in this study is a subsection of the International Corpus of Crosslinguistic Interlanguage, ICCI (Tono & Díez-Bedmar, 2014). This subsection contains 17,034 words handwritten by Spanish secondary school learners of English in Grades 7, 8, 11 and 12 (see note 1) in response to the prompt, "Describe your favourite film. What happens in it?". The number of texts per grade corresponds to the number of informants in each grade, as only one text was written by each student. All NPs in the learner corpus were analysed using TAASSC 1.0 (Kyle, 2016). For the manual parsing of the NPs in the learner corpus, the five most frequently used nouns in each grade were identified using Wmatrix (Rayson, 2008). As a result, the NP headwords in Table 2 were manually parsed. Due to the non-normal distribution of the data, as shown in the Levene test (Section 4.2), nonparametric tests (Kruskal-Wallis tests and Mann-Whitney tests) were run to find out the criterial features regarding NP complexity using both syntactic complexity indices and corpus-driven analyses (Section 4.2). A breakdown of the learner corpus and NPs analysed per grade is shown in Table 3.

1. These grades correspond to the first and second year in Compulsory Secondary Education, i.e. Educación Secundaria Obligatoria (ESO), and the first and second years in Non-Compulsory Secondary Education, i.e. Bachillerato, in the Spanish education system. The students' ages are 12 (Grade 7), 13 (Grade 8), 16 (Grade 11) and 17 (Grade 12). In the lower grades the students aim at a CEFR A2 level, whereas CEFR B1 level is the target in the higher grades (11 and 12).

Syntactic complexity indices
The first method employed to analyze NP complexity was the measurement of NP syntactic complexity with the aid of TAASSC 1.0 (Kyle, 2016). The use of TAASSC allows us to capture the complexity of nominal structures headed by both nouns and pronouns and the syntactic constituents in Biber et al. (1999). For the purposes of our research (RQ2), we selected the four NP compound indices from Kyle (2016); see Appendix 2.
A pilot study was conducted to test the reliability of the POS tagging and parsing of TAASSC in our learner data. We chose two random essays from Grades 7 and 12 and the automated analysis was manually checked. Accuracy of tagging and parsing was extraordinarily high. POS tagging was 98% accurate with minor issues in areas of unknown or misspelt words such as quidich, incorrectly tagged as an adverb, or the case of lives, tagged as a noun instead of a verb in its original context. The collapsed dependencies analysed all seemed to have been correctly allocated at the 93% accuracy level (Relation -Governor -Dependent), higher than 91% reported in Kyle (2016). The accuracy of the tagging and parsing may be explained by the written, punctuated nature of the language used.

Corpus-driven manual analysis of NPs
The second method employed to analyse NP complexity was the manual parsing of the NPs in the learner corpus (RQ3). Both authors manually parsed the NPs in the study independently. As a result of the Cohen's κ test (κ = .936, 95% CI, p < .0005), an almost perfect agreement between both was observed in their manual parsing. The twenty-nine different types of NPs found in the learner corpus (see Section 4.1.2 and Appendix 1 for an overview of the types, their description and examples) were divided into four NP groups, considering modification use and, if so, the type, as divided into the use of premodification, postmodification, or both: i. Simple NPs and NPs with a determiner or more than a determiner (det NPs); ii. Premodified NPs (prem NPs); iii. Postmodified NPs (postm NPs); and iv. Premodified and postmodified NPs (prem & postm NPs).
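The four-way grouping described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the annotation procedure actually followed: the `ParsedNP` record, its field names and the example phrases are hypothetical and were invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ParsedNP:
    """One manually parsed NP. Field names are illustrative assumptions."""
    head: str
    determiners: int = 0    # number of determiners before the head
    premodifiers: int = 0   # number of premodifying words or phrases
    postmodifiers: int = 0  # number of postmodifying phrases or clauses


def np_group(phrase: ParsedNP) -> str:
    """Assign an NP to one of the four NP groups named in the text."""
    has_pre = phrase.premodifiers > 0
    has_post = phrase.postmodifiers > 0
    if has_pre and has_post:
        return "prem & postm NPs"
    if has_pre:
        return "prem NPs"
    if has_post:
        return "postm NPs"
    # Simple NPs and NPs with one or more determiners share a group.
    return "det NPs"


# Invented examples, not taken from the learner corpus.
examples = {
    "the film": ParsedNP("film", determiners=1),
    "my favourite film": ParsedNP("film", determiners=1, premodifiers=1),
    "the film about pirates": ParsedNP("film", determiners=1, postmodifiers=1),
    "a good film about pirates": ParsedNP(
        "film", determiners=1, premodifiers=1, postmodifiers=1
    ),
}

for text, phrase in examples.items():
    print(f"{text!r}: {np_group(phrase)}")
```

Note that the number of determiners does not affect the group; it only matters for the finer-grained NP types within each group.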
As shown in the overview of pattern types in Spanish secondary school learner writing in Appendix 1, when the premodification or the postmodification in the NP was realised by one word (see, for instance, pattern types 6 and 13 in Appendix 1), the terms 'premodification' or 'postmodification' were used, whereas the term 'multiple premodification' or 'multiple postmodification' was employed when they were realised by more than one word (see e.g. pattern types 8 and 16) (Biber et al., 1999).
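The inter-annotator agreement check reported in this section relies on Cohen's κ, which corrects the raw proportion of agreement for the agreement expected by chance. A minimal sketch of the computation follows, assuming each annotator's decisions are stored as a list of category labels; the two label lists are invented for illustration and are not the study's annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two annotators' marginal proportions.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of six NPs (invented, not from the corpus).
annotator_1 = ["det", "det", "prem", "postm", "prem", "det"]
annotator_2 = ["det", "det", "prem", "postm", "det", "det"]

print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

In this toy case the annotators agree on five of six items, but because "det" is frequent for both, the chance-corrected value is noticeably lower than the raw agreement.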

Results
In this section, we show the results obtained after the analysis of NP complexity in learner writing using both methods. Section 4.1 provides an overview of the findings across grades for each method. Section 4.2 explains the criterial features identified in the analysis of NP complexity across grades with the data obtained thanks to the two methods employed.

Analysing syntactic complexity across grades
Section 4.1.1 describes the outcomes when using the NP indices in TAASSC to answer RQ2. Section 4.1.2 provides an overview of the NP types identified in the learner corpus as a result of the manual parsing of the NPs in the learner corpus and the results across grades to answer RQ3.

Using NP syntactic complexity indices in TAASSC
To answer RQ2, four syntactic complexity indices were analysed using TAASSC: "Noun phrase elaboration" (NPE), "Nouns as modifiers and modifier variation", "Determiners" and "Possessives". NPE measures the number of prepositions, adjectives, determiners, and verbal modifiers per nominal. Table 4 offers the NPE mean across the groups analysed.
The "Nouns as modifiers and modifier variation" (henceforth "Nouns as modifiers") index measures the use of nouns as nominal modifiers by calculating the number of modifiers per nominal, including prepositional objects, direct objects, and nominal subjects. Table 5 offers the "Nouns as modifiers and modifier variation" mean across the groups analysed. "Determiners" is a compound index which measures the use of determiners in noun phrases. The mean of the determiners across the groups analysed is given in Table 6. "Possessives" is a compound index which measures the use of possessives in nominal subjects, direct objects, and prepositional objects. Table 7 shows the mean of the possessives across the four groups.

Manual parsing of NPs: Corpus-driven classification of NP types
To answer RQ3, the NPs in the learner corpus were manually parsed and divided into four groups. Table 8 shows the percentage of NP groups per grade. The results show that the most frequently used NP group is det NPs (more than half of the NPs in the learner corpus per grade), followed by prem NPs, postm NPs, and the prem & postm NPs. The study of the NPs analysed across grades revealed the following trends. Grade 7 and Grade 11 learners employed a similar percentage of det NPs, and there was a decrease in the percentage of prem NPs in Grade 12 when compared to the other grades. Learners in Grade 12 seemed to reduce the percentage of prem NPs and postm NPs in favour of det NPs and prem & postm NPs. In other words, when any type of modification was used, NPs by Grade 12 learners seemed to become more complex in terms of the use of premodification and postmodification in the same NP. The highest percentage of postm NPs was found in Grade 7, whereas the lowest percentage was in Grade 8. The data in Figure 1 suggest that the presence of prem & postm NPs may characterise NPs in learner writing in Grades 11 and 12. Although a couple of examples of this type of NP were found in Grade 8, it is in Grades 11 and 12 that they were most frequent (4.62% and 6.31%, respectively). After providing a bird's-eye view of the four NP groups per grade, a detailed analysis of the NP types per NP group is provided. Since learner writing is characterised by both correct and non-target uses of structures, a non-prescriptive description has been undertaken to be faithful to the IL stage that learners have reached in each of the grades analysed. Appendix 1 includes the pattern description of the twenty-nine NP types in the learner corpus and a representative example taken from the learner corpus to illustrate each.
The NP group det NPs includes bare NPs as well as NPs in which the head is accompanied by one or more determiners. Table 9 shows the overview of the NPs in this group for each grade with mean and standard deviation of each NP type. The NP types which were also found embedded in prepositional phrases are specified ("prep").  Examples of non-target uses of the language have been found in the analysis of the NPs in the first and the third types. In pattern type 1, simple NPs in the singular, which are non-target structures, have been found in the corpus on six occasions: two in Grade 11 and four in Grade 12. In four out of those six cases, the simple NP was embedded in a postmodifying prepositional phrase. Thus, the embedding of this NP type in a postmodifying structure may have caused the learners' nontarget use of a simple NP in the singular; see Example (1).
(1) Tom, who has a collection of flowers, hasn't a good relationship with ∅ boy (icci_esp0718_Grade 12)
In pattern 3, the non-target use of mutually exclusive determiners has been found in six cases, as in Example (2).
(2) The this film (icci_esp0614_Grade 11)
Coordinated NPs were not found in Grade 12 learners. Pattern 5, i.e. det + det + head [conjoin], was only found in Grade 8. Learners in Grade 7 did not use any NP with two determiners. Out of the five possible patterns, learners in Grades 7 and 12 only used three NP types, learners in Grade 11 used four, and learners in Grade 8 used all the possible NP types in this group.
Table 10 provides the results of the analysis of the NPs which present any type of premodification (the prem NPs group). There are six possible types of such NPs, two of which were also found embedded in prepositional phrases (prep). The study of prem NPs in the learner corpus points to important features. NP type 7 was the most frequently used in all grades and the only one used in Grade 7. In fact, the lack of variety in premodified NPs in Grade 7 is characteristic of learner writing in that grade. The type of phrases used to premodify the head of the NPs analysed were NPs and AdjP in the case of simple NPs. In the case of NPs with a determiner or more than one determiner, the phrase types used to premodify the head of the NP were NPs, AdjPs and GPs. Coordinated NPs (NP type 11) were found in all grades but Grade 7. Due to its complexity, an interesting NP found in the learner corpus in Grade 12 featured the NP as an apposition of a previous NP (see NP type 10 in Appendix 1). The findings also show that the higher the grade, the more prem NP types were used. However, none of the learner groups used all the NP types included in this group.
The analysis of postm NPs revealed that learners used some non-target structures, possibly due to L1 transfer, to modify the head of the NP. Manual analysis identified the pattern types in Table 11 in the NPs with postmodification. The pattern types embedded in prepositional phrases (prep) are indicated. Table 11 reveals a number of trends in the use of postm NPs in learner writing in the different grades. The two most frequently used types of postmodified NPs in the learner corpus across grades are NP types 14 and 19, which correspond to two of the correct uses of postmodified NPs. However, these two NP types show opposite patterns: the number of uses of NP type 19 increases with grade, whereas the number of uses of NP type 14 decreases with grade. Non-target NPs, namely, types 13, 15, 16 and 17, were found in Grades 7, 8, 11 and 12, although they were more frequent in Grades 11 and 12. In the case of NP type 16, examples were written in Grades 7, 11 and 12. NP types 15 and 17 were only found in Grade 11 learner writing. The presence of non-target NP types in these grades may point to the learners' risk-taking when writing their descriptions in the FL. There were three NP types which were only found in one grade (NP types 12, 15 and 17), which may stem from the role played by the input received in those grades or the input that specific learners may have outside the instructional setting. Postmodification by means of multiple phrases was mainly found in Grades 11 and 12. NPs with apposition (types 20 and 21) were more frequent in the lower grades. Finally, the number of NP types used in the four grades increased from Grade 7 to Grades 8 and 11 (5, 6 and 7 NP types, respectively) to then decrease in Grade 12 (6 NP types, as in Grade 8). Therefore, no single learner group showed the ten possible NP types identified in the analysis of postm NPs. The learner corpus also contains prem & postm NPs, some of which were found embedded in PPs (prep).
Table 12 shows the low number of prem & postm NP types found in the learner corpus. These NP types were mostly used by learners in the upper levels. Grade 7 learners did not use these NP types, Grade 8 learners only used two NP types, Grade 11 learners used four, and Grade 12 learners used seven. The NP types used by Grade 11 learners were characterised by two variables: (i) NP types in which PPs were used to postmodify the head of the NP, regardless of the type of premodification or the complexity of the PP; and (ii) NPs in which only relative clauses were used to postmodify the head of the NP. No learner group used all eight NP types in which both pre- and postmodification are found.

NP complexity in secondary school learner writing: Criterial features
To answer RQ4, statistical tests were run with the results obtained in 4.1.1 and 4.1.2 to find out criterial features across grades in the analysis of NP complexity in secondary school learner writing. The Levene tests conducted on the data, considering the NP complexity indices and the NP types, revealed that the data were non-normally distributed (p < .05), which called for the use of nonparametric tests. An Independent-Samples Kruskal-Wallis Test was conducted to assess if there were differences in the NPE index between the four groups of learners. The results revealed no statistically significant differences (n = 173, H(3) = 5.499, p = .139).
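For readers unfamiliar with the test, the Kruskal-Wallis H statistic pools the scores of all groups, replaces them with ranks and compares the rank sums of the groups. The following is a minimal pure-Python sketch of the standard tie-corrected formula; it is not the software used in the study, and the per-grade scores are invented for illustration.

```python
from collections import Counter
from itertools import chain


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of non-empty samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: shrink the denominator for every set of tied scores.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Hypothetical index scores per grade (invented, not the study's data).
scores_by_grade = {
    "Grade 7": [0.10, 0.12, 0.15, 0.11],
    "Grade 8": [0.09, 0.13, 0.14, 0.16],
    "Grade 11": [0.18, 0.17, 0.21, 0.15],
    "Grade 12": [0.22, 0.19, 0.25, 0.20],
}

h = kruskal_wallis_h(list(scores_by_grade.values()))
print(f"H({len(scores_by_grade) - 1}) = {h:.3f}")
```

With four groups, H is evaluated against a chi-square distribution with three degrees of freedom, which is why the results in this section are reported as H(3).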
For the "Nouns as modifiers" index, the results revealed statistically significant differences (n = 173, H(3) = 11.387, p = .010, r = .07), although with a small effect size (where r = .1 corresponds to a small effect size, r = .3 to a medium one and r = .5 to a large one) (Cohen, 1988). Subsequent Mann-Whitney tests were run to explore the data further. As a result, a statistically significant difference was found between Grade 8 and Grade 12 learners (U = 1336.000, z = −3.071, p = .002, r = .33), with a medium effect size. Table 13 offers the pairwise comparisons across the groups of learners. The Kruskal-Wallis H test statistic shows the differences between the groups. Larger values point to larger differences. The "Nouns as modifiers" index therefore proves a criterial feature if non-consecutive grades (8 and 12 in this case) are considered (see Table 14). The results of the Independent-Samples Kruskal-Wallis Test on the "Determiners" index revealed no statistically significant differences (n = 173, H(3) = 4.407, p = .221) between the groups of learners. Likewise, no statistically significant differences (n = 173, H(3) = 6.879, p = .076) were found for the "Possessives" index between the groups of learners. Non-parametric tests were also run due to the non-normal distribution of the data (p < .05) to find out if any of the NP types in Section 4.1.2 and Appendix 1 are criterial at any of the four grades considered in this study. The results of the Kruskal-Wallis test revealed five NP types which are statistically significant. Two of the criterial NP types belong to the det NP group: simple NPs (n = 173, H(3) = 7.969, p = .046, r = .05) and NPs composed of a determiner and a head (n = 173; H(3) = 14.629, p = .002, r = .09).
The other three criterial NP types belong to the remaining NP groups: a prem NP type, NPs with a determiner, multiple premodification and a head (n = 173, H(3) = 11.746, p = .008, r = .07); a postm NP type, NPs with a determiner, head and postmodification by means of a multiple PP (n = 173, H(3) = 8.363, p = .039, r = .05); and a prem & postm NP type, NPs composed of a determiner, premodification, a head, a relative clause or an apposition (n = 173, H(3) = 7.523, p = .057, r = .04). However, the effect sizes found are small, following Cohen's (1988) criteria for non-parametric tests.
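The text does not state how the effect sizes for the Kruskal-Wallis tests were computed. As an observation only, the reported values are numerically consistent with dividing H by n − 1 (an epsilon-squared-style estimate), which the sketch below checks against the figures given above. The function name and the dictionary labels are illustrative; the H values, n = 173 and the effect sizes are copied from the results.

```python
def h_over_n_minus_1(h, n):
    """Epsilon-squared-style effect size for a Kruskal-Wallis H statistic."""
    return h / (n - 1)


N = 173  # number of learners reported in the text

# label -> (H statistic, effect size as reported in the text)
reported = {
    "Nouns as modifiers index": (11.387, 0.07),
    "simple NP": (7.969, 0.05),
    "det + head": (14.629, 0.09),
    "det + multiple prem + head": (11.746, 0.07),
    "det + head + multiple PP": (8.363, 0.05),
    "det + prem + head + relative clause/apposition": (7.523, 0.04),
}

for label, (h, effect) in reported.items():
    estimate = h_over_n_minus_1(h, N)
    print(f"{label}: H/(n-1) = {estimate:.3f} (reported {effect:.2f})")
```

Each estimate rounds to the value reported in the text, which is why this formula is offered as a plausible reading of the figures rather than as the authors' documented procedure.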
The scenario found reveals that if consecutive grades are analysed, i.e. Grades 7 and 8 and Grades 11 and 12, there is no criterial NP type which distinguishes Grades 7 and 8. However, differences are found between Grades 11 and 12 in the two NP types, namely det + head and det + multiple prem + head. The analysis of non-consecutive grades (i.e. Grades 7 and 11, Grades 7 and 12, Grades 8 and 11, and Grades 8 and 12) reveals that the NP type det + head is criterial between Grades 7 and 12. Overall, the "Nouns as modifiers" index is found to be a criterial feature between Grades 8 and 12. Furthermore, two NP types (NP type 2 and NP type 9) are revealed as criterial features between the two grades in Non-Compulsory Secondary Education, NP type 2 also being criterial between Grades 7 and 12 (see Table 14). We claim that a combination of methods aptly describes NP syntactic complexity in learner writing (RQ5), as the use of only one method would have overlooked important results that help us gain a more comprehensive knowledge of the development of nominal syntactic complexity.
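The pairwise follow-up described above can be sketched as follows: every pair of grades is compared with a Mann-Whitney U test, U is converted to z with the usual normal approximation (tie-corrected, without a continuity correction), and r is taken as |z| / √N, where N is the combined size of the two groups. This is a minimal illustration on made-up scores; the function names and data are hypothetical, and the text does not say which software options or corrections the authors applied.

```python
import math
from collections import Counter
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, z, r) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    ties = sum(c ** 3 - c for c in Counter(pooled).values())
    sd = math.sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))
    z = (u - n1 * n2 / 2) / sd
    return u, z, abs(z) / math.sqrt(n)


# Hypothetical "Nouns as modifiers" scores per grade (not the study's data).
scores = {
    7: [0.02, 0.05, 0.03, 0.04],
    8: [0.03, 0.04, 0.06, 0.05],
    11: [0.05, 0.07, 0.06, 0.09],
    12: [0.08, 0.10, 0.07, 0.12],
}

consecutive = {(7, 8), (11, 12)}  # grade pairs the text treats as consecutive
for a, b in combinations(sorted(scores), 2):
    u, z, r = mann_whitney(scores[a], scores[b])
    kind = "consecutive" if (a, b) in consecutive else "non-consecutive"
    print(f"Grades {a} vs {b} ({kind}): U = {u:.1f}, z = {z:.3f}, r = {r:.2f}")
```

Under this definition of r, the reported z = −3.071 and r = .33 would correspond to a combined group size of roughly 84 to 89 learners in Grades 8 and 12, although the group sizes are not given in this section and this remains an inference from the rounded figures.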

Discussion
The overarching research question in this paper (RQ1) is discussed in the following subsections. Section 5.1 deals with NP types and grades, Section 5.2 delves into the use of complex structures in each NP group and grades, and Section 5.3 tackles the number of non-target-like NP types.

NP types and grades
The number of pattern types used in each NP group/grade reveals different trends. Adding up all the NP types used by each group, we find that, of the twenty-nine NP types found in the learner corpus, Grade 7 learners use nine NP types, Grade 8 learners use seventeen types, Grade 11 learners use twenty types and Grade 12 learners use twenty-one types. This finding suggests that the more proficient users seem to display a wider range of NP types.
The number of det NP types and postm NP types used at different grades shows an irregular pattern. The lowest number of det NP types is found in writing by learners in Grades 7 and 12, with the highest number of det NP types in Grade 8. Postm NP types show a steady increase in the number of NP types used until Grade 11, which then decreases in Grade 12. Prem NPs and prem & postm NPs show a steady increase in the number of pattern types used. The difference is that the number of prem NP types reaches its highest in Grade 11 and remains the same in Grade 12, whereas there is a clear and steady evolution from Grade 7 to Grade 12 in prem & postm NP types.
In short, the higher the grade, the more NP pattern types used, especially in prem NPs and prem & postm NPs. Prem NP types and prem & postm NP types deserve a special mention, as these show steady increases from Grade 7 to Grade 12. However, many NP pattern types coexist in the grades analysed. For instance, appositions and relative clauses are used to postmodify the head of the NP in the four grades analysed, appositions showing a higher mean in Grade 7 than in the other grades.

The use of complex structures in each NP group and grade
The analysis of the NPs in the learner corpus reveals findings about syntactic complexity regarding the use of two determiners, coordinated NPs, multiple pre- and postmodification, appositions and relative clauses. When learners use prem NP types, postm NP types and prem & postm NP types, only one determiner (or no determiner at all) is employed. Two determiners in NPs with determiners are used by all groups except Grade 7 learners. Coordinated NPs are only found in det NPs and prem NPs, as learners who postmodify the head of the NP or premodify and postmodify the head of the NP do not coordinate such phrases. When det NPs are coordinated, NP pattern type 4 is used by all learner groups except for Grade 12 learners, whereas pattern type 5 is used exclusively by Grade 8 learners.
In the case of coordination in prem NPs, NP pattern type 11 is used by all learners except for those in Grade 7 (who only use premodification pattern type 7, which replicates the prompt in the instructions, i.e. "my favourite film is"). As a result, det NPs coordinated to other phrases are found at all levels but Grade 12 and coordinated prem NPs are used at all levels but in Grade 7. This finding complements previous research carried out by Biber & Gray (2016), who observe phrasal elaboration to be a feature of advanced writers' language. Our results suggest that it may take a long time for phrasal coordination to emerge in NP patterns involving postmodification, so non-relative clause postmodification in particular may pose extra challenges for language learners in instructed settings.
Multiple premodification (i.e. pattern types 8, 9 and 11) is found at all levels except for Grade 7. However, apart from some cases in Grades 8 and 11, it is in Grade 12 that most of the cases of multiple premodification are found (see pattern type 9). Examples with compound adjective phrases functioning as premodifiers of the NP are found (an interesting and very exciting film, icci_esp0691_Grade 12). This is also the case with other structures, such as premodified genitive phrases (main character's friend, icci_esp0716_Grade 12) or premodified adjective phrases (a very fantastic film, icci_esp_0688_Grade 12).
However, an Independent-Samples Kruskal-Wallis Test reveals statistically significant differences between Grade 8 and Grade 12 learners (p = .012) in the "Nouns as modifiers" index. This finding suggests that noun modification of NP heads is widely used by more proficient learners. The fact that there are no significant differences in adjectival premodification confirms previous findings on the emergence of this feature in earlier stages of English language learning (Paquot, 2019). These findings, if confirmed by further studies, may be useful for complexity analyses of learner language and thus complement existing clause and sentence level measures. We argue that the use of the "Nouns as modifiers" index and multiple premodification may be used as measures of syntactic complexity in English L2 writing.
Instances of multiple postmodification (i.e. pattern types 15, 16, 17 and 18) are mainly found in Grades 11 and 12. It is worth remembering that some of these pattern types represent non-target uses of English. Multiple postmodification in pattern types in which premodification is also present (i.e. pattern types 23 and 25) is found in Grades 8, 11 and 12. These findings corroborate the non-linear, non-incremental nature of language complexity in general, and of complex noun phrases in particular.
Relative clauses as shown in pattern type 19 are used by all learner groups. Other structures which include a relative clause, namely pattern types 17 and 21, are used by Grade 11 learners and by Grade 7 and 8 learners, respectively. Relative clauses in NPs which also include any type of premodification, i.e. pattern types 26, 27, 28 and 29, are mainly found in Grade 12 writing, examples of pattern types 26 and 27 being found in Grade 11.
The features which reveal "new facets" (Vyatkina, 2013: 24) in learner writing in this study are some complex nominals, i.e. the ones related to multiple premodification, which are found to be criterial between Grades 11 and 12 writing. In particular, nouns used as modifiers are significantly more frequent in Grade 12 than in Grade 8. Other less complex NP pattern types, i.e. type 2, also show differences in learner writing between Grades 11 and 12 and 7 and 12. Noun premodification, prepositional phrases and appositive phrases are found in all grades, but do not characterise learner writing at any level (except for multiple premodification between Grades 11 and 12), which does not support the results in . A trend is found, however, in the use of multiple PPs (as opposed to PPs), since these are only used by Grade 11 and 12 learners.
Most learner groups use more complex structures (i.e. multiple types of modification and appositions) in postmodification environments. In fact, learners in Grades 7, 11 and 12 use multiple postmodification and learners in all grades use apposition in postmodification. Our analysis shows that coordinated NPs and relative clauses are used by all learner groups. Although the results are to be considered cautiously, only NP pattern types 2 and 9 show statistical differences between Grades 11 and 12, and between Grades 7 and 12 for pattern type 2. Similarly, the "Nouns as modifiers" index is statistically significant when comparing Grades 8 and 12, suggesting that it is in premodifying slots that NP complexity indices prove more useful in SLA studies. However, the NPE index shows no significant differences across the four grades analysed. These findings seem to indicate that, at least for Spanish informants in the ICCI, syntactic complexity may take longer to develop in EFL contexts and that, contrary to Tomasello's (2003) suggestion, automatic, incremental transfer of NP syntactic functions such as pre- and postmodification does not occur across the grades analysed. Our findings challenge previous learner corpus analyses in the English Profile Project, which have revealed that, regarding noun complexity, postnominal modification with -ed, double embedded genitive with of… of, postnominal modification with -ing, double embedded genitive with of… 's, and relative clauses with whose are criterial at the A2 and B1 levels (UCLES/CUP, 2011: 16-24). In the case of our Spanish informants, postmodification indices show no statistical differences across the four levels analysed. Further analyses of Spanish and other L1 learners should aim to test these findings.

Number of non-target-like NP types used and grade
The analysis of the non-target-like NP types in the learner corpus and the learners who produce them reveals that most of the erroneous NP types are produced by learners in Grades 11 and 12, as they explore their IL producing non-target-like forms, as opposed to their counterparts in lower grades who do not produce non-target-like forms when advancing their ILs.
In the case of det NPs, non-target uses of pattern type 1 and pattern type 3 are produced by Grades 11 and 12 learners. Grade 8 learners only show two non-target uses in the structure in pattern type 3. For postm NPs, pattern types 13, 15, 16 and 17 are more frequently used by Grade 11 and 12 learners. Therefore, learners in the higher grades use more non-target structures than those learners in the lower levels. This finding might be explained by Grade 11 and Grade 12 learners' experimentation with language at an advanced IL stage to describe their favourite film. An outcome of their risk-taking is the production of a number of non-target-like NP types, whereas learners in the lower grades keep to the basic NP types to express themselves in the FL.

Conclusion
The manual parsing of the eight hundred and thirty-two NPs in the learner corpus shows that (i) the use of two determiners is common across all learner groups except for Grade 7; (ii) multiple postmodification is used by most learner groups (Grades 7, 11 and 12), as is the case with multiple premodification (Grades 8, 11 and 12); and (iii) appositions in postmodifying NPs are used by all learner groups. However, appositions in prem & postm NPs are only found in Grade 11 and 12 learners.
Our research highlights the need for using combined methods of analysis that examine the same data from different perspectives. The use of statistical complexity analysis software (Kyle, 2016) has allowed us to account for every single noun and nominal group in the corpus. The range of indices in Kyle (2016) has allowed us to approach syntactic phenomena from a purely quantitative perspective. As a result, we have found that the use of the "Nouns as modifiers" index yields significant differences between Grades 8 and 12, which confirms our finding that premodification slots are of interest for the study of learner language development. The corpus-driven manual analysis of NPs, in turn, has allowed us to gain an in-depth understanding of the types of complexity patterns used by learners in the different grades. As a result of this approach, our research has produced a learner-generated taxonomy of NP syntactic complexity that can be used in studies that examine learner language in other contexts. By combining these two research methods, we hope to make a case for their integration and to enrich methodological pluralism (McEnery & Hardie, 2012; Römer, 2016). Moreover, the findings obtained with the two methods are consistent and thus show promising avenues for collaboration and complementarity.
Two methodological features of this study are worth considering. The fine-grained classification of NP types, which includes every NP type found in the corpus, may have determined the results of the statistical analysis: the more detailed the classification of NPs, the more likely it is to obtain a low number of instances in some of the NP types. Another feature to be considered is that the manual parsing conducted did not include every single noun in the corpus. This may be seen as a limitation of this study. Another limitation lies in the use of automatic analysis software and POS tagging that was not written primarily to navigate learner language. The impact of these systems on learner-language analysis has rarely been explored in corpus linguistics, and we believe that these software solutions should be sensitive to the range of disfluencies of learner language. If the small number of errors found in the use of automatic tools in learner language is considered tolerable, the automatic analysis of complexity and frequency indices in learner language can be beneficial. Finally, this study has not offered a Contrastive Interlanguage Analysis (CIA) (Granger, 1996, 2015) as it is beyond the scope of this paper to look at other L1 learners or English as an L1.
Byrnes & Sinicrope (2008) illustrate how SLA studies have paid more attention to clauses and verbs as explicit noun phrase complexity has not attracted the interest of authors wishing to measure students' progress. Our results reveal statistically significant differences between Grade 8 and Grade 12 learners in the quantity and distribution of nouns used as premodifiers in NPs. This finding, together with the two criterial features in NP complexity revealed thanks to the taxonomy obtained as a result of the fine-grained parsing of the NPs in this study, may open up new insights into NP syntactic complexity and language development.
This is a new area which has received less attention than the analysis of T-units and other clauses. The fact that our young writers tend to concentrate syntactic complexity on postmodifying slots seems to suggest that premodifying slots are ignored by less proficient learners for the expression of complex meanings. Future research should concentrate on these different slots across different L1 populations to contribute to a better understanding of how noun syntactic complexity develops across different groups of language learners.