Proximate and ultimate explanations of individual differences in language use and language acquisition

I evaluate three schools in linguistics (structuralism, generative linguistics, and usage-based linguistics) from the perspective of Karl Popper’s critical rationalism. Theories (providing proximate explanations) may be falsified at some point in time. In contrast, metatheories, such as Darwin’s theory of evolution and the theory of Language as a Complex Adaptive System (LCAS) (providing ultimate explanations), are falsifiable in principle, but not likely to be falsified. I then argue that LCAS provides a fruitful framework for the explanation of individual differences in language acquisition and use. Unequal frequency distributions of linguistic elements constitute a necessary characteristic of language production, in line with LCAS. However, explaining individual differences implies explaining commonalities (Hulstijn, 2015, 2019). While attributes such as people’s level of education and profession are visible in knowledge of the standard language (declarative knowledge acquired in school), they may be invisible in the spoken vernacular (linguistic cognition shared by all native speakers).

Uncertainty: that is perhaps the best word to characterize the feelings that I experienced most frequently throughout my intellectual journey, from the time I entered university until the present. Uncertainty about what to think of a new theoretical viewpoint. What to think of an ongoing debate between scholars defending different theories? What to think of new methodologies? My intellectual journey consisted of ups and downs and the downs were really unpleasant. Most of you, the audience of this talk, are PhD students. You are perhaps already experiencing similar feelings of uncertainty. In the first part of this talk, I provide you with four beacons, concerning fundamental issues in the appraisal of different theories and metatheories, so that you can navigate your own intellectual journey more successfully than I did. I conclude that Language as a Complex Adaptive System (LCAS) is "the" ultimate metatheory for phenomena of language acquisition, language use, and language change. In the second part of this talk, I try to show how fruitful LCAS is for the explanation of individual differences in language acquisition and use.

Proximate and ultimate explanations in linguistics and psychology

1.1 Three paradigms in linguistics and psychology
The first beacon consists of this broad picture of three paradigms (schools) in linguistics and psychology (Figure 1). During my undergraduate years in the late 1960s, my linguistics professors were all structuralists (of various European and North-American structuralist schools). Structuralism, which had its roots in 19th-century comparative-historical linguistics, gave priority to description of languages (with so-called "distributional analysis") rather than explanation of why languages are the way they are (Robins, 1990, Chapter 8). I remember how obsessed my teachers were with definitions and so-called "discovery procedures" that apply bottom-up, starting with reliable observations (Robins, 1990, p. 237). For example, I remember endless and pointless discussions about whether the /g/-like sound in Dutch, as in zakdoek, should be classified as a real phoneme or not. There were endless discussions about the question of whether the word green in a sentence like she painted the door green should be called an adjective, adverb, or something else. The scientific method of the time was fundamentally positivist, descriptive rather than explanatory, empiricist, bottom-up, inductive (Bloomfield, 1935, p. 20; Robins, 1990, Chapter 8). This reflected the need to protect true science from pseudoscience, like astrology or Sigmund Freud's psychoanalysis.
In 1970, I enrolled in the graduate program of linguistics and there, one of our instructors, a junior staff member, had heard of Noam Chomsky and of Karl Popper. It was then that my enthusiasm for linguistics and scientific inquiry arose. Let me first say a few things about Karl Popper's critical rationalism and then about Chomsky and generative linguistics.

Popper (1902-1994): Critical rationalism
Popper's critical rationalism is your second beacon. Popper (1959) argued that it is best to start with a puzzle or problem, which has to be solved. Puzzling phenomena require an explanation (a theory) and a problem requires a solution. Falsifiable hypotheses are derived from the theory and subsequently tested empirically. The findings of the empirical investigation may lead, as indicated by arrow 1 in Figure 2, to a rejection of the theory, or a change, or they are seen as support for the theory. The findings may also provide a new understanding of what we believed to be the puzzling phenomena or problem with which we started, indicated by arrow 2.
(2016). See also Hulstijn (2015), Chapter 1.

Individual differences in language use and acquisition
With respect to both the problem and the findings of the investigation, it is best not to use the word facts but rather the term observations, to remind ourselves of the need to interpret our findings. It is also best to say that there are no theory-free phenomena or observations. According to Popper (1959):
-It does not matter where a claim comes from.
-Claims should be implausible, the bolder the better.
-Knowledge is provisional.
-Theories are tools.
Popper argued that, with his approach to scientific inquiry, we would move from one theory to a better theory, and so on. At that point in time, that is, in the 1970s, I believed that there was no end to the chain of theories following each other. This is a crucial point, to which I will come back. Chomsky (born in 1928) did exactly what Popper had demanded: present a theory with bold claims. Chomsky's views appealed to a whole generation of linguists, worldwide. For me, one of the most spectacular ideas in generative linguistics, which Chomsky put forward in the 1980s, was the idea of universal grammar, consisting of "principles" and "parameters" (Chomsky, 1981, pp. 3-4). There was a beauty in the idea (as I understood it, perhaps incorrectly, at the time) that if, for example, two structures fall under one parameter, it would suffice for children acquiring the language to be exposed to instances of one of those structures in order to acquire the other structure, even without exposure to instances of that other structure (Hyams, 1986). This was not only bold; this was beautiful! And exciting for researchers of first and second language acquisition. Another ground for my admiration of Chomsky was that, over the many years of his career, he had the courage to reject earlier ideas, replacing them with newer ideas, exactly in the spirit of Popper's cycle of scientific inquiry. It is impossible to do justice, in a few lines, to 60 years of generative linguistics. But for me, these are the essential points.

Chomsky: Generative linguistics
-It adopts a generative-enumerative approach to characterize in the most economical way (top down) language as an infinitely large set of grammatical expressions (Rooryck, 2006).
-In its architecture, syntax takes central position, in between meaning and sound (AUTOSYN, AUTOKNOW, AUTOGRAM; Newmeyer, 1998).
-The explanation of language acquisition is based on the assumption of the so-called poverty of the stimulus (Chomsky, 1980, p. 34).
-In the more recent version of generative linguistics, the notion of faculty of language is seen as a universal computational system (Chomsky, 2005). Or, as Culicover and Jackendoff (2005, p. 5) put it: Universal Grammar is a "toolkit".

Usage-based linguistics
Towards the end of the 1980s, I had started to supervise my first PhD student, who had obtained her degree in cognitive psychology. And she drew my attention to the seminal Competition Model of Elizabeth Bates and Brian MacWhinney (1989). And not much later my daughter drew my attention to Dynamic Systems Theory in the study of human body performance (Thelen & Smith, 1994). So I gradually began to realize that I might be witnessing a second paradigm shift: first the one from structuralism to generative linguistics and then a new paradigm characterized by neural-network thinking, dynamic systems, cue competition and a lot more. And yes indeed, exactly as the philosopher Thomas Kuhn (1962) predicts, scholars in the new and old paradigm talk straight past one another and the debates between them do not lead to a compromise. They simply do not share each other's assumptions. Again, it is impossible to characterize a whole paradigm in a few lines, but these are the features that appealed to me, with Language as a Complex Adaptive System standing out.
-Each language is a complex adaptive system (The Five Graces Group, 2009).
According to Muthukrishna and Henrich (2019), Darwin's (1859) views on evolution began as a theory but they have, meanwhile, developed into a metatheory, also called a theoretical framework. While theories address what the ethologist Tinbergen (1963) called proximate questions (about causation and development), metatheories (theoretical frameworks) address ultimate questions (concerned with evolution and function).⁵ Charles Darwin conceived of nature as a Complex Adaptive System, led by his well-known principles: Variety, Selection (generalized survival), Innovation, and Replication (Van den Bergh, 2018, p. 19). Darwin's views still stand after 160 years of scientific inquiry. His main views have been supported by an impressive amount of empirical research, even after the advent of molecular biology. A crucial question is whether Darwin's evolutionary framework is falsifiable. Van den Bergh (2018, p. 31) answers this question in the affirmative. According to Van den Bergh, Darwin's theory would be falsified, for example:
-if physiological and behavioral diversity were lacking in any population or species;
-if a species were permanently badly adapted to its environment without becoming extinct;
-if the physical basis (DNA) to store and accumulate information did not exist;
-if the Earth should prove to be too young to permit an evolutionary unfolding of life.
So it seems to me that what we demand of a theoretical framework/metatheory is that its claims are falsifiable in principle. However, this does not mean that at some point Darwin's claims must and will be falsified.
Similarly, from the view of Language as a Complex Adaptive System, it seems to me that claims can be derived that are falsifiable in principle.⁶
-As long as they are being used, languages must change.
-In language users, the mental lexicon and grammar must change (even under attrition in L1ers and fossilization in L2ers).
-In multilinguals, languages must affect one another.
-Grammatical structure emerges implicitly in L1 and L2 acquisition, given sufficient input.
-There is always competition between simplification and complexification.
-Natural languages must exhibit polysemy, ambiguity, and fuzzy word-class categories.
-Language use is characterized by unequal frequencies of linguistic elements.

5. Konrad Lorenz and Niko Tinbergen were awarded the Nobel Prize in Physiology or Medicine 1973 "for their discoveries concerning organization and elicitation of individual and social behaviour patterns."
6. Language as a Complex Adaptive System is fully compatible with Darwin's theory of evolution. For an excellent overview (and comparison) of different views on the evolution of language, see Tallerman and Gibson (2012). See also Keller (1994), for an evolutionarily viable view on language evolution and change, and a lucid comparison of the "invisible hand in language" with Popper's World 3.
A typical feature of language is that its lexical units occur in unequal frequencies (Bybee & Hopper, 2001; Ellis, 2002). Crucial for a proper understanding of this phenomenon is that there is not a gradual decrease of frequencies along a straight regression line, but that the curve of raw frequencies shows a sharp decrease first and that it later levels off in a very long asymptotic slope. Grammatical categories have different sizes and size-differences are associated with frequency. For example, in most European languages, the class of pronouns contains relatively few members but most pronouns occur relatively frequently. In contrast, the class of nouns is huge and the large majority of nouns appears only very rarely. The members of small classes, such as pronouns, efficiently compress information. In contrast, in open word classes, such as nouns, there exists semantic hierarchy, so that we can refer to our dog, for example, with expressions that differ in specificity: spaniel (very rare), dog (not so rare), animal (frequent) (Lestrade, 2017). In other words, natural languages must contain ambiguity, fuzzy constructions, and fuzzy word-class categories. And spoken and written discourse must reflect Zipf-like unequal distributions of linguistic elements (Piantadosi, 2014), in a push-and-pull struggle between pressures to make language more complex and pressures to make language more simple and economical (Mufwene, Coupé & Pellegrino, 2017; Hawkins, 2004, 2014). All these characteristics, I would like to argue, are falsifiable in principle. In the second part of this presentation, I will come back to this phenomenon of unequal frequencies.
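The shape of such an unequal distribution can be made concrete with a small computation. What follows is only a minimal sketch in Python, not a method taken from any of the cited studies: the function names `rank_frequency` and `head_share` and the toy sample are invented for illustration, and a real analysis would of course require a large, representative corpus.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count; break ties alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, count) for rank, (word, count) in enumerate(ranked, start=1)]


def head_share(table, top_n):
    """Proportion of all tokens covered by the top_n most frequent types."""
    total = sum(count for _, _, count in table)
    head = sum(count for _, _, count in table[:top_n])
    return head / total


# Toy sample, invented for illustration only (hypothetical data).
sample = (
    "the dog saw the cat and the cat saw the dog and "
    "the spaniel chased a bird in the garden"
)
table = rank_frequency(sample)
for rank, word, count in table[:5]:
    print(rank, word, count)
print(head_share(table, 3))
```

Even in this tiny sample, a handful of word types accounts for a large share of all tokens, while most types occur only once. That is the sharp initial decrease followed by a long tail described above.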
As far as I can tell from the literature, early proponents of Language as a Complex Adaptive System (alternatively named [Complex] Dynamic Systems Theory) in the field of second language acquisition, such as Diane Larsen-Freeman, Nick Ellis, and, at the University of Groningen, Paul van Geert, Kees de Bot, Wander Lowie and Marjolijn Verspoor, have not explicitly claimed that this framework is falsifiable (e.g., De Bot, Lowie & Verspoor, 2005; Ellis & Larsen-Freeman, 2009; Larsen-Freeman, 1997; Van Geert, 1994, 1998).⁷ My proposal is that we make a distinction between (a) falsifiable in principle but not likely to be falsified (which applies to the theoretical frameworks/metatheories of evolutionary biology and Language as a Complex Adaptive System), and (b) falsifiable (which must apply to theories under these frameworks/metatheories). So your third beacon is that the metatheory of Language as a Complex Adaptive System is falsifiable in principle but that the essential claims of Language as a Complex Adaptive System are not likely to be falsified. They will simply remain falsifiable in principle.
For many years, I had thought that usage-based linguistics, emergentism and Language as a Complex Adaptive System were surely to be followed by yet another theory, in a long process. I was waiting for yet another paradigm shift. But I have now come to the conclusion (and this is your fourth beacon) that there will be no next paradigm shift in the study of language use, language acquisition and language change. We will witness new theories under the umbrella of the metatheory of Language as a Complex Adaptive System. Some of those theories will make contradictory claims and, via a process of falsification, some of them will be rejected, exactly in the spirit of Popper's critical rationalism. I think that, from our present retrospective point, we can conclude that the earlier paradigms (structuralism, generative linguistics, behaviorism, first-wave cognitive psychology, listed in Figure 1) were not so much 'wrong' as less successful in explaining fundamental issues in the understanding of language use, language acquisition, and language change. Popper, a strong supporter of Darwin (see Bradie, 2016), would be pleased, I guess, had he witnessed that the metatheory of Language as a Complex Adaptive System had become an ally of Darwin's metatheory of evolution.

Explaining individual differences under the view of LCAS
In the second part of this presentation, I turn to the explanation of individual differences in language use and language acquisition. To what extent can a Complex System view on language help us explain individual differences?
Yes, it is obvious that people differ, in their biological make-up, in their ecological and social environment, in their behaviour, and thus also in the use and proficiency of their native language, or languages. But people do not only differ, they also exhibit commonalities. A theory of individual differences must also explain commonalities. Where and why do people not differ?

7. I would like to emphasize that I find myself on the same page with these eminent scholars. The fact that I add the claim that Language as a Complex Adaptive System is falsifiable in principle but unlikely ever to be falsified should only be seen as support for their views.

Basic and extended language cognition
In Hulstijn (2015), I proposed BLC Theory as a framework for the investigation of differences and commonalities in language proficiency. I will first give you a visualization of the two main constructs of BLC Theory and then a verbal definition. In Figure 3, first language acquisition is shown as a cone, starting in the first year of life at the bottom of the picture and growing larger with time. The grey area represents the linguistic content that all native speakers acquire.⁸ This is called BLC, or shared linguistic cognition. But children, adolescents and adults also acquire words, expressions and morphosyntactic constructions that are not shared by the whole language population. This is called extended language cognition (ELC). Hulstijn (2011) gives the following verbal definition of BLC.
BLC pertains to (a) the largely implicit, unconscious knowledge in the domains of phonetics, prosody, phonology, morphology and syntax, (b) the largely explicit, conscious knowledge in the [lexical-pragmatic] domain (form-meaning mappings), in combination with (c) the automaticity with which these types of knowledge can be processed. BLC is restricted to frequent lexical items and frequent grammatical structures that may occur in any communicative situation, common to all adult native speakers, regardless of age, literacy or educational level. (Adapted from Hulstijn, 2011, p. 230.)

8. There is no lid on the cone, to indicate the impossibility of defining maximal language cognition. Language attrition, caused by reduced use of the language, is not visualized in this figure.

Recall that language is characterized by Zipf-like, unequal distributions. Let us assume that we computed the raw frequencies of lexical and grammatical elements in a huge corpus of spoken language, truly representative of language, produced, in a wide variety of communicative situations, by people of different ages and different levels of education and profession. The idea then is that BLC pertains to knowledge and use of the elements in the steep part of the distribution of raw frequencies, i.e., to the elements that occur relatively frequently (Hulstijn, 2015, pp. 22-24; Hulstijn, 2019, p. 160).

Vernacular versus standard language
Note that BLC refers to the spoken language in everyday life. This is commonly called the vernacular (Pawley & Snyder, 1983). But between ages 4 and 6, children start to attend school (in most countries) and there they are taught the standard language (or languages) of their country or region. They are taught that their language can be rendered in writing. They learn to read and write. They learn the subtle conventions of punctuation and capitalization. In school, children learn what text is, that written text consists of sentences, paragraphs and larger units. They learn that there are many different text genres, with their own characteristics. All this requires the learning of a large body of declarative knowledge, including metalinguistic knowledge.
Not all students in elementary and secondary school are equally successful in learning these features of the standard language. Level of Education and Profession and language-related leisure-time activities are potentially related to amount and types of literacy experiences. In societies with compulsory education for all children, all typically developing children learn to read and write, at least at a basic level. But not all people read and write to the same extent. Many so-called aliterate people (not illiterate but a-literate people) limit themselves to reading and writing short messages, for example, in social media. So the content of extended language cognition is likely to vary by literacy-related attributes (Hulstijn, 2019, p. 164). This is an empirical question and BLC Theory offers a framework for conceptualizing this. As I argued in Hulstijn (2019), BLC Theory is a framework for investigating people's linguistic cognition, as a function of their memberships in extra-linguistically-defined groups, such as people differing in age, literacy, education, profession, working-memory capacity, executive functions, intelligence, or personality.
2.3 Individual differences work out differently in control of the vernacular and in control of the standard language
Language production (spoken or written) of each person, regardless of age or level of education and profession, exhibits variance at all levels of linguistic analysis, with regard to both type and frequency of linguistic elements. Language production varies in the use of members of relatively large word classes such as nouns and verbs, in the use of grammatical constructions, in the internal structure of constituents (such as noun phrases), in clause types (e.g., relative clauses, wh-cleft sentences), and in clause length. And language production varies in the frequency with which particular words (both single-word and multi-word units) and constructions are being used. For example, it would be impossible to give a verbal account of, say, what you did last weekend, in such a way that all sentences consisted of two clauses, that each clause consisted of six words, that each word consisted of five phonemes, and that all words belonged to the same frequency band (e.g., in a huge corpus of spoken language, produced by a large variety of individuals).
The variance in all forms of language production reflects inherent properties of Language as a Complex Adaptive System. For example, in describing what you did last weekend, you must use members of different word classes. You will use articles, personal and deictic pronouns, conjunctions and prepositions. The most frequent members of these word classes are generally short (Haspelmath, submitted). You must also use members of open word classes, such as nouns, verbs and adjectives, that differ in frequency because they differ in semantic specificity (Lestrade, 2017), and infrequent nouns are more likely to consist of more morphemes than frequent ones (e.g., bike [more frequent, one morpheme] versus three-wheeler [less frequent, three morphemes]). Clause length will also differ, depending on syntactic verb patterns, as in I think │that he had given her a vase with flowers for her birthday (2 and 12 words). Thus there is inherent variance in (almost) all forms of language production. In the study of individual differences in language use, it is mandatory to distinguish between within-person variance and between-person variance. Some of the between-person variance may be associated with attributes such as age, level of education etc. But some of the between-person variance may not be associated with such attributes.
Let me give an example of these different types of variance from a study of individual differences in language cognition of 98 adult native speakers of Dutch, who differed in age (18-76) and level of education/profession (EP), reported in Mulder and Hulstijn (2011) and Hulstijn (2017). Participants performed a battery of 11 language-related tasks, consisting of (1) four computer-administered speed tasks, measuring word association, auditory lexical decision, visual lexical decision, and picture naming, (2) a paper-and-pencil vocabulary knowledge test, (3) an auditory and a visual word-span task, and (4) four speaking tasks. Some individual differences were associated with EP. For example, in comparison to low-EP participants, high-EP participants performed significantly better in the vocabulary test, the word-association test and the auditory working-memory test. In the speaking tasks, they produced longer responses, they produced responses more successful in information and argumentation, and they produced proportionally fewer grammatical errors (Mulder & Hulstijn, 2011). However, in the speaking tasks, no significant associations with EP were observed in the individual differences in clause length or in the use of a range of syntactic patterns, such as the use of relative clauses, wh-cleft sentences, center-embedded clauses, and passives (Hulstijn, 2017).
So it seems to me that the study of individual differences should take into account the characteristics of Language as a Complex Adaptive System, evident in the inherent variance in a person's language production. Language production is probabilistic and does not directly reflect a person's knowledge because knowledge (of, for example, fronted object clauses) is gradient: the construction is more or less entrenched in the person's mental grammar (Hulstijn, 2019, pp. 171-174). In contrast, performance in a conventional grammar test (including items assessing knowledge of fronted object clauses) is scored dichotomously: the response either reflects correct knowledge or it does not. High-EP people are likely to perform significantly better than low-EP people in conventional tests of linguistic knowledge but the lexical/grammatical complexity of spoken language produced by high- and low-EP people may not differ, because of the inherent (probabilistic) variance in language production. The challenge for future research is to tease apart (a) the noise in every person's language production, resulting from inherent probabilistic variance (LCAS), from (b) differences in language production reflecting genuine differences of linguistic cognition.
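One simple way to picture the distinction between within-person and between-person variance is a sum-of-squares decomposition of a single measure, such as clause length. The following Python sketch is illustrative only: it is not the analysis used in Mulder and Hulstijn (2011) or Hulstijn (2017), the function name `variance_components` is an assumption, and the clause lengths are invented.

```python
from statistics import mean


def variance_components(samples):
    """Split the total sum of squares into between- and within-person parts.

    samples maps a speaker id to a list of measurements (e.g. clause lengths).
    """
    all_values = [v for values in samples.values() for v in values]
    grand_mean = mean(all_values)
    # Between-person part: how far each speaker's mean lies from the grand mean.
    between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in samples.values()
    )
    # Within-person part: how far each clause lies from its own speaker's mean.
    within = sum(
        (v - mean(values)) ** 2
        for values in samples.values()
        for v in values
    )
    total = sum((v - grand_mean) ** 2 for v in all_values)
    return {"between": between, "within": within, "total": total}


# Hypothetical clause lengths (in words) for three speakers; invented data.
clause_lengths = {
    "speaker_a": [2, 12, 5, 7],
    "speaker_b": [3, 11, 6, 8],
    "speaker_c": [4, 10, 6, 8],
}
parts = variance_components(clause_lengths)
share_between = parts["between"] / parts["total"]
print(parts, share_between)
```

In this invented example nearly all of the variation lies within speakers. This is the kind of pattern one would expect if inherent probabilistic variance dominates the measure and genuine differences between persons are small.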
Furthermore, personal attributes may be associated with language acquisition in different ways. In a population of people not affected by language-related disorders, individual differences in factors like speed of information processing, working-memory capacity and intelligence may impact the rate of language acquisition (both BLC and extended language cognition, ELC) during childhood and later in life. However, they cannot be associated with the control of BLC (once attained), simply because, by definition, BLC (the vernacular) is shared by all adult native speakers, fast or slow, smart or not so smart. In contrast, attributes such as level of education, profession, and leisure-time activities (including literacy practices) will only be associated with the control of ELC, not with the control of BLC.⁹

9. For a slightly different but compatible complex-system view on individual differences, based on Complex Dynamic Systems Theory, see Lowie and Verspoor (2019).

Language assessment
Before rounding off this talk, let me briefly address the question of assessment measures. We assess language proficiency in clinical settings, in educational settings, and in language acquisition research. I would like to propose that, in choosing appropriate tests, we keep in mind two things. First, the fact that assessment of individual differences almost always implies the assessment of commonalities (BLC: shared linguistic cognition). Second, it may be useful to consider the possibility that causal effects of attributes like level of education and profession may be visible more in the use of the standard language than in the use of the vernacular. In second-language acquisition research we have the custom of what I call the checklist method. From a grammar we compile a list of morphosyntactic structures and we then check, one by one, whether language learners have acquired these structures, often in so-called developmental orders (Hulstijn, Ellis, & Eskildsen, 2015). I would like to see this kind of assessment complemented by, and embedded in, various computations indicating the state of the mental grammar as a whole, as an emerging complex adaptive system. For example, one might obtain a grammatical 'fingerprint', probabilistically computed on the basis of a speech sample (Bod, 2015), using software algorithms designed in the field of author identification (digital text forensics) and author profiling (Neal, Sundararajan, Fatima, Yan, Xiang, & Woodard, 2018; Potha & Stamatatos, 2019).¹⁰
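To give a rough idea of what such a 'fingerprint' could look like computationally, here is a minimal Python sketch. It swaps in a very crude stylometric technique, a profile of function-word frequencies compared by cosine similarity, and it should not be read as the approach of Bod (2015) or of the author-identification systems cited above. The word list, the function names and the two samples are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Small illustrative set of English function words (an assumption, not a standard list).
FUNCTION_WORDS = ("the", "a", "and", "that", "of", "to", "in", "i", "it", "was")


def fingerprint(sample):
    """Relative frequency of each function word in a speech sample."""
    tokens = re.findall(r"[a-z']+", sample.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for an empty sample
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two profiles; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Two invented transcribed samples (hypothetical data).
sample_one = "I think that he was in the garden and it was a nice day"
sample_two = "The dog of the neighbour ran to the park and to the river"
profile_one = fingerprint(sample_one)
profile_two = fingerprint(sample_two)
similarity = cosine_similarity(profile_one, profile_two)
print(similarity)
```

A profile of this kind describes the sample as a whole rather than ticking off individual structures, which is the contrast with the checklist method. A serious implementation would use much richer features (for example, syntactic patterns) and far larger speech samples.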

Conclusions
At the beginning of this talk I spoke of my feelings of uncertainty which accompanied me during my intellectual journey in the study of language. Uncertainty is part and parcel of working in scientific inquiry. I hope I have given you some beacons for your own intellectual itinerary. Here are my conclusions.
-Language as a Complex Adaptive System is probably "the" ultimate metatheory for language use, language acquisition and language change.
-Therefore, theories of language use, language acquisition and language change ought to be in line with the metatheory of Language as a Complex Adaptive System.
-While theories may be falsified at some point in time, Language as a Complex Adaptive System is falsifiable in principle, but unlikely to be falsified.
-A theory of individual differences in language use and language acquisition is simultaneously a theory of commonalities. Individual differences may work out differently in control of the vernacular and in control of the standard language.
Thank you for your attention.

10. For a recent attempt to use Complex Dynamic Systems Theory in classroom settings of second language instruction, see Han (2019).

Figure 2. The scientific cycle

Figure 3. Basic and extended language cognition

Theory and metatheory: Proximate and ultimate explanations
It took me many years of uncertainty to understand that we have to distinguish between theory and metatheory and that Language as a Complex Adaptive System is a metatheory, commensurate with Darwin's theory of evolution. Let me explain what this really means. According to Muthukrishna and Henrich ( 1.5