Argument doubling with proper nouns in spoken Dutch: A corpus study

Argument doubling, also known as (contrastive) left-dislocation, is common in spoken Dutch, but it is unclear exactly what triggers it. Earlier proposals in the literature showed that the construction is not used for marking contrast, and suggested it is used for marking shifted topics instead. However, the results from a Spoken Dutch Corpus study on argument doubling with proper nouns demonstrate that topic-shift does not adequately characterize the construction’s function either. Further examination of our corpus data shows that at least for proper nouns, Dutch argument doubling mostly occurs when a new referent is introduced into the discourse, but that this referent does not necessarily become the topic of the discourse. We hypothesize that argument doubling is a way of giving speakers and/or hearers some extra time to establish and/or process the new discourse referent in the discourse, regardless of whether it will become a discourse-topic after its introduction.

That (1) sounds exactly the same as (2) was discovered by one of the authors of the present paper, whose partner has a colleague named Gertie and a friend named Gert. It resulted in miscommunication, which piqued our interest in this construction. The fact that (1) and (2) sound the same shows there need not be a prosodic separation between the antecedent noun phrase and the anaphoric pronoun. This is indeed what de Vries (2009: 294) claims for Dutch, Shaer & Frey (2004: 469) for German, and Shor (2020: 1827) for Israeli Hebrew. Because we are dealing with an integrated construction here (Matić et al. 2016), the term subject doubling would be more appropriate than left dislocation (Shor 2020).
Another example is presented in (3). Unless indicated otherwise, the examples presented in this article are taken from the Spoken Dutch Corpus (Oostdijk 2000); punctuation was added by us.
(3) 'Pauline (she) always says that she wants to buy an apartment on the coast for later.' (CGN: fv400403)
The subject Pauline is realized twice in (3), first by the antecedent noun phrase, and then by the demonstrative pronoun die 'that' in bold. Although the phenomenon may be more common with subjects, it is often illustrated with a case of object doubling (e.g. de Vries 2009: 293), as in (4).
(4) 'Jeanette you know (her), right?' (CGN: fn000793)
Since the antecedent noun phrase can be either a subject (as in (3)) or an object (as in (4)), we use the term argument doubling to refer to these constructions. De Vries (2009) analyzes the construction in (4) as a case of topicalization, like (5), but with an additional demonstrative pronoun appositively connected to the peripheral antecedent, which is not dislocated at all, but an integrated part of the clause.
(5) 'Jeanette you know, right?'
This makes (contrastive) left dislocation crucially different from hanging topic left dislocation, as in (6), where the peripheral object is "clearly separated from the intonational contour of the clause" (de Vries 2009: 294).
(6) Jeanette, je kent haar nog wel hè?
'Jeanette, you know her, right?'
In (3) and (4), the pronoun die 'that' is adjacent to its antecedent, consistent with its status as appositive (de Vries 2009). This supports de Vries's (2009) claim that the construction is syntactically and prosodically similar to ordinary topicalization, as in (5). However, de Vries (2009) points out that when the left peripheral argument is a subject, there is a difference between topicalization and argument doubling. Because subjects are already sentence-initial in the canonical word order in Dutch, they do not have the properties of topicalized constituents. If these properties nevertheless need to be marked, "the way to do it is to insert a demonstrative pronoun, which results in a [contrastive left dislocation] construction" (de Vries 2009: 303). Thus, de Vries suggests that subject doubling takes over the discourse function from topicalization, in order to compensate for the gap in the paradigm. What this discourse function might be according to de Vries and others is discussed below in Section 2.
Neither the prosodic nor the syntactic features of argument doubling in Dutch will be further discussed here. Instead, we focus on the discourse properties of the construction, because in the literature so far there seems to be no unambiguous answer to the question of when argument doubling occurs. Section 2 discusses some previous studies on the functions of left dislocation constructions in English and Dutch, whose hypotheses are tested by means of a Spoken Dutch Corpus study in Section 3. Section 4 discusses the results of this corpus study. Section 5 concludes.

Previous literature on argument doubling in Dutch and English
De Vries (2009) treats topicalization and argument doubling alike, as discussed in the previous section, but argues that the left peripheral argument in neither construction is necessarily a topic, i.e. what the sentence is about. Similarly, Prince (1998) argues that left dislocation in English has nothing to do with topic-marking. This is illustrated with several examples, such as the following passage (the indices are added by Prince 1998: 284):
(7) My sister_i got stabbed. She_i died. Two of my sisters were living together on 18th Street. They had gone to bed, and this man, their girlfriend's husband, came in. He started fussing with my sister_i and she_i started to scream. The landlady_j, she_j went up, and he laid her_j out. So sister_i went to get a wash cloth to put on her_j, he stabbed her_i in the back.
Prince argues that the landlady in (7) does not function as a topic, since it would not make sense to consider the sentence an answer to the question What about the landlady?, or to paraphrase it with a topic-marker such as As for the landlady. Indeed, "the only relevance the landlady has in the story is that it was to help her that the sister turned away from the angry man, enabling him to stab her in the back" (Prince 1998: 284). Another example discussed by Prince (1998: 285) is (8). In (8) the left dislocated construction is used to introduce a new referent into the discourse, whether the hearer already knows Aunt Katherine or not. Thus, both in (7) and (8), the left-dislocations crucially introduce new entities in "the current segment of the discourse-model for the first time, regardless of whether the hearer is assumed to already know about them or not" (Prince 1998: 286). Prince argues that the function of left dislocation here is to remove the full noun phrases from the subject position, which is a disfavoured position for discourse-new entities. By creating a separate processing position, the pronouns have become discourse-old when they are encountered in their canonical subject positions within the clause.
Prince identifies two other functions of left dislocated constructions in English. The second function is to trigger an inference on the part of the hearer that the referent represented by the left dislocated noun phrase stands in a salient poset (partially ordered set) relation to some entity or entities already present in the discourse model. When there is an opposition in what is predicated of the members of the poset, this gives rise to a contrast relation, but that is not necessary. Prince (1998: 291) notes that the predicates may even give rise to a sort of "counter-contrast".
The third function of left dislocation constructions in English that Prince distinguishes is actually a type of topicalization, but with a resumptive pronoun instead of a gap. This happens when the "extraction site is difficult or impossible for reasons having to do with grammatical processing and where the speakers salvage the situation by leaving a resumptive pronoun in situ" (Prince 1998: 295). Prince considers such examples "topicalization in disguise", and therefore an "illusion of" left dislocation (Prince 1998: 296). Apart from this third syntactically motivated function of left dislocation in English, Prince (1998) thus distinguishes two discourse functions: (i) introducing a new discourse referent in left peripheral position; (ii) establishing a relationship with a previously introduced entity or set of entities.
It is not clear whether these two discourse functions also apply to argument doubling in Dutch. Prince argues that the relation between form and function is language-specific. For example, she points out that Yiddish left dislocation constructions only serve to mark poset inference (the second function of English left dislocation), but not to remove discourse-new noun phrases from subject position (the first function of English left dislocation), because unlike English, Yiddish already has another syntactic construction for that purpose, viz. subject postposing.
Like Prince (1998), de Vries (2009) claims that the left peripheral constituents in Dutch argument doubling constructions need not be topics. However, while Prince (1998) explicitly argued against contrast as a characteristic function of left dislocation in English, de Vries (2009) claims contrast (defined as activated presupposition of alternatives, plus a choice) to be the one and only necessary condition for argument doubling. Thus, while ordinary subjects need not be contrastive, subjects in argument doubling are, which in his view makes the term contrastive left dislocation appropriate as far as the term contrastive is concerned (albeit not the part left dislocation). That is, according to de Vries's (2009) own intuition, sentences such as (3) and (4) above can only be used in a contrastive context.
However, as shown in later research (Stoop 2011; Veeninga et al. 2011), contrast is not a necessary condition for argument doubling in Dutch. Example (9) constitutes an explicitly non-contrastive example (Prince's 1998 "counter-contrast"), as indicated by the marker of avoiding contrast ook 'too' (Schmitz et al. 2018). Note that (9) could still be characterized in terms of a poset relation, in the sense that a relationship is established with a previously introduced entity (Prince 1998).
(9) 'So now she is at peace with that. And Kees (he) is at peace with it too.' (CGN: fn006899)
Stoop (2011) conducted a small corpus study in the Spoken Dutch Corpus, and found that argument doubling is indeed frequent in spoken Dutch, since 27 out of 300 occurrences of the pronoun die 'that' (almost 10%) were used for argument doubling. Stoop rejects de Vries's (2009) claim that argument doubling is a variant of topicalization, arguing, among other things, that personal pronouns can be topicalized while they cannot be the antecedent in argument doubling. We will return to this later.
In order to test de Vries's (2009) hypothesis on the role of contrast in argument doubling, Stoop (2011) investigated a sample of 100 randomly chosen instances of argument doubling in Dutch, in 83 cases of which there was no presupposed set of alternatives, hence no relation of contrast. We provide the discourse in (10) from our own corpus study as an example of the absence of contrast in argument doubling.
(10) B: Waarom?
A: Ja omdat niet alle mannen vinden dat lekker. Blijkbaar zit daar iets in wat mannen niet zo lekker vinden.
B: Mannen?
A: Ja, want, ja dat is echt zo, want Carlo die kwam later naar mij toe. Hij zegt: "Het is hartstikke lekker maar d'r zit iets in wat ik niet lekker vind."
A: 'Yes she used less of that now.'
B: 'Why?'
A: 'Yeah, because not all men like that. Apparently, there's something in there that men don't like as much.'
B: 'Men?'
A: 'Yes, because, yes that's true, because Carlo (he) came to me later. He says: "It's very tasty but there's something in it that I don't like."' (CGN: fn006822)
Although one might argue that (10) contains a contrastive relation between men who don't like it (i.e. oregano) and a presupposed set of women who do, the argument doubling does not refer to this contrast. Instead, it refers to Carlo who came to the speaker later, and there is no presupposed set of alternative people who came or did not come to the speaker. While de Vries (2009) assumed that contrast is a necessary condition for both topicalization and argument doubling, he also noticed a difference between the two, namely that argument doubling has a "colloquial flavor". This characteristic is emphasized by Stoop (2011), who argues that the main difference between argument doubling and topicalization is that the former is restricted to spoken language. In order to mark shifted topics, topicalization may be used in written language, and argument doubling in spoken language. Stoop concludes that in spoken language the discourse function of argument doubling is twofold: a shifted topic is marked, indicating what the speaker is talking about (sentence topic) and what the speaker wishes to talk about from that point on (discourse topic).
Veeninga et al. (2011) conducted a corpus study and an experiment to investigate what the discourse function of argument doubling in Dutch might be. Their corpus study consisted of production data from different groups of participants who narrated four stories based on pictures. Each story contained exactly two animate discourse referents (the stories also contained inanimate discourse referents such as a balloon or an ice cream cone, but these did not play a role in argument doubling or were not checked for it). The argument doubling constructions in their data set were annotated for discourse-newness and contrast. An argument was coded as discourse-new if there was no earlier reference to it, and as discourse-old if there was. A referent was coded as contrastive if the other referent in the story had already been introduced, and as non-contrastive if that was not the case. Veeninga et al. (2011) found that in most cases of argument doubling the referent was discourse-old, although in almost half of the cases (45%) it was discourse-new, and that in most cases (72%) it was non-contrastive. However, their notion of contrast is different from what is normally understood by contrast. In their study, a contrastive referent is the second referent in a narrative with two referents, while a non-contrastive referent is the first referent. Whether or not a contrastive relation is established between the two referents remains an open question. We therefore suppose that the number of occurrences of contrast in argument doubling is even smaller than assumed in Veeninga et al. (2011) (see also Stoop 2011).
It is notable that most of the referents referred to in argument doubling in Veeninga et al.'s (2011) corpus study are discourse-old, that is, they were mentioned in the discourse before. Both definite and indefinite descriptions can be used in argument doubling, as witnessed in (11) and (12) from our own corpus study.
(11) 'The Black Diva. That movie I would never want to see (it).' (CGN: fn007999)
(12) Want een goeie vriend van Lies die woont in Londen.
'Because a good friend of Lies (he) lives in London.' (CGN: fv400683)
For indefinite but not definite descriptions, an alternative, i.e. existential, construction is available for the introduction of a discourse-new referent, hence we predict argument doubling to occur more frequently with definite descriptions than with indefinite descriptions, and therefore more frequently with discourse-old referents than with discourse-new referents.
Under the assumption that argument doubling is mainly used with discourse-old referents, Veeninga et al. (2011) conducted an experiment to find out whether shifted topics result in argument doubling more often than continuing topics. They presented participants with stories that contained two discourse referents represented by definite descriptions. One example of such a story is (13).
(13) De barman gaat ijs kopen in het plantsoen. De barman vraagt de prins om mee te gaan naar de ijskraam. De barman beschrijft de prins de verschillende smaken onderweg. De barman/prins die _
'The bartender goes to buy ice cream in the park. The bartender asks the prince to join him to the ice cream stand. The bartender describes to the prince the different flavors along the way. The bartender/prince who/he _'
The task of the participants was to complete the last sentence of (13), in which the pronoun die 'that' is ambiguous between a demonstrative and a relative pronoun. Veeninga et al. (2011) hypothesized that the continuing topic the bartender would result in less argument doubling than the shifted topic the prince. However, when participants performed the task orally, they produced argument doubling in 100% of the cases, compared to 43% when they performed the task in writing.
A difference between shifted and continuing topics was found when participants performed the auditory task after performing the written task. Only then did participants produce significantly more argument doubling after a shifted topic (80%) than after a continuing topic (75%). Veeninga et al. (2011) suggest that argument doubling helps a discourse-old referent that was not yet the topic become the topic of the discourse (shifted topic). However, the evidence for this hypothesis is largely lacking. Veeninga et al. (2011) used only definite descriptions for the referents in their stories. Notoriously, however, definite descriptions are prototypically used for anaphoric, hence discourse-old referents. Moreover, the difference between a shifted and a continuing topic in stories such as (13) comes across as highly unnatural, because continuing topics are normally realized by pronouns, not by a repetition of definite descriptions. A personal pronoun, which is a prototypical continuing topic, can never be the antecedent in argument doubling (Stoop 2011). The next section will test the hypothesis of Veeninga et al. (2011), according to which argument doubling is mainly used with discourse-old referents to mark a shifted topic. To avoid the issues we identified with Veeninga et al.'s (2011) studies, we investigate the discourse features of argument doubling only for proper nouns. In contrast to definite descriptions, proper nouns are not necessarily anaphoric and thus can refer to both discourse-new and discourse-old referents. That is to say, even if proper nouns refer to individuals known to the hearer, they can be new in "the current segment of the discourse-model" (Prince 1998: 286).

Method
We searched the whole Spoken Dutch Corpus for sentences in which a proper noun was combined with the demonstrative pronoun die 'that', and extracted these 4115 occurrences with the 50 words preceding the proper noun and the 50 words following the demonstrative pronoun, in order to be able to make use of the context in our analysis. For the 4115 occurrences we found, we first determined whether it was a case of argument doubling or not. This excluded 2449 cases in which, for example, the pronoun die was used attributively (e.g. Controleert Inge die gesprekken helemaal? 'Does Inge check those calls completely?'), or as a relative pronoun (e.g. Dat bijgeluid was Saskia die chips aan het pakken was 'That background noise was Saskia who was grabbing chips').
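The extraction step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the corpus is available as a flat list of (word, tag) pairs and that proper nouns carry a hypothetical tag `PROPN`; neither the function name `find_candidates` nor this tag is taken from the Spoken Dutch Corpus, whose actual annotation format differs. The sketch only collects candidate hits with their context windows; deciding whether a hit is a case of argument doubling remains a manual annotation step.

```python
def find_candidates(tokens, window=50):
    """Collect proper noun + 'die' sequences with surrounding context.

    tokens: list of (word, tag) pairs; the tag "PROPN" for proper nouns
    is a hypothetical simplification, not the corpus's own tagset.
    """
    hits = []
    for i in range(len(tokens) - 1):
        word, tag = tokens[i]
        next_word, _ = tokens[i + 1]
        if tag == "PROPN" and next_word.lower() == "die":
            hits.append({
                # up to `window` words preceding the proper noun
                "left": [w for w, _ in tokens[max(0, i - window):i]],
                "match": (word, next_word),
                # up to `window` words following the demonstrative
                "right": [w for w, _ in tokens[i + 2:i + 2 + window]],
            })
    return hits


# Toy input based on a fragment of example (10); the tags are invented.
toy_tokens = [
    ("want", "CONJ"), ("Carlo", "PROPN"), ("die", "PRON"),
    ("kwam", "VERB"), ("later", "ADV"),
]
toy_hits = find_candidates(toy_tokens, window=2)
```

A fixed word window keeps the context comparable across hits, at the cost of cutting off earlier mentions that fall outside it.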
For the remaining 1666 cases of argument doubling, we annotated whether the argument was (i) a proper noun, to check whether our search criteria correctly yielded proper nouns only; (ii) animate, in line with Veeninga et al.'s (2011) choice for animate discourse referents in their corpus study; (iii) a subject, and (iv) discourse-new. All annotations were made independently by two pairs of two annotators each. An initial check of inter-annotator agreement yielded a reliability level of Cohen's κ = .72 (p < .001). Any disagreements between the two teams were discussed, and in all cases consensus was reached. A referent was considered discourse-old if it had already been mentioned in the preceding 50-word context. We annotated the proper noun as discourse-old when the proper noun's referent was present in the preceding discourse, regardless of how it was referred to, e.g. by a proper noun or a pronoun.
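Cohen's κ corrects the observed agreement between two raters for the agreement expected by chance. The following sketch shows the computation on invented toy labels; the function name `cohens_kappa` and the example annotations are assumptions for illustration and do not reproduce the annotation data of this study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over all categories.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(
        (freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b)
    )
    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented toy annotations for the discourse-new / discourse-old dimension.
team_1 = ["new", "new", "old", "old"]
team_2 = ["new", "old", "old", "old"]
toy_kappa = cohens_kappa(team_1, team_2)
```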
In addition to these four dimensions, we indicated which predicate was used with argument doubling in case of subject argument doubling. We had no specific hypothesis motivating this, but Shor (2020), for example, found predicates in subject doubling in Israeli Hebrew to be mostly nominal.
For a smaller subset of 551 cases in our dataset, we coded the discourse-topic of the preceding context, and the discourse-topic of the subsequent context. In this way, we could determine whether argument doubling signaled a topic-shift. Note that the argument doubling construction itself did not count towards marking the referent as a discourse-topic or not, because this would be a vacuous definition of topicality. The annotated data set is available in a repository.
In our analyses, we only used argument doubling cases in which the proper noun referred to an animate entity, because this made it easier to compare our results with the results of Veeninga et al. (2011), who also only considered argument doubling in the case of animate referents. This means that we excluded 225 cases in which the argument was not a proper noun, e.g. when the proper noun was part of a bigger constituent as in de kinderen van Thea 'Thea's children', as well as 96 cases where the referent was inanimate. Our final set thus consisted of 1345 occurrences of argument doubling with an animate proper noun.

Results
We found that in 1225 of the 1345 cases, argument doubling occurred in the parts of the Spoken Dutch Corpus containing spontaneous spoken language production (rather than written material read aloud), specifically spontaneous commentaries including sports broadcast on radio and television, spontaneous conversations, and telephone dialogues, thus confirming that argument doubling is a spoken-language phenomenon (de Vries 2009; Stoop 2011; Veeninga et al. 2011). The frequencies of subject and object doubling, and the referent's oldness or newness in the discourse are shown in Table 1.
Table 1 shows that argument doubling occurs far more often when the argument is the subject of the sentence (1318 vs. 27), and when the argument is new in the discourse (997 vs. 348). If argument doubling is unrelated to whether an argument is new in the discourse, we expect 62.56% (i.e. 841 cases) to be discourse-new and 37.44% (i.e. 504 cases) to be discourse-old based on the proportions of discourse-new and discourse-old arguments in our subcorpus of telephone dialogues. A chi-square goodness-of-fit test showed a significant relationship between argument doubling and discourse status (χ²(1, N = 1345) = 76.82, p < .001), with doubled arguments more often occurring with discourse-new referents than expected in spontaneous conversations.
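The goodness-of-fit statistic can be recomputed from the counts given in the text. The sketch below uses only the Python standard library and is not necessarily the software used for the original analysis; the function name is hypothetical. It assumes the baseline proportions as reported (62.56% and 37.44%) and exploits the fact that with two categories there is one degree of freedom, so the p-value can be obtained from the complementary error function.

```python
import math


def chi_square_gof_two_categories(observed, expected_props):
    """Chi-square goodness-of-fit test for exactly two categories (df = 1)."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For df = 1 the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, expected, p_value


# Observed counts from the text: discourse-new vs. discourse-old.
observed_counts = [997, 348]
# Baseline proportions as reported for the telephone-dialogue subcorpus.
baseline_props = [0.6256, 0.3744]

stat, expected, p_value = chi_square_gof_two_categories(
    observed_counts, baseline_props
)
print(f"chi2(1, N = {sum(observed_counts)}) = {stat:.2f}, p = {p_value:.2e}")
```

Using the unrounded baseline 386/617 instead of 0.6256 changes the statistic only in the second decimal place.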
The discourse topic of the preceding and following contexts was annotated for a subset of our data, to find out whether argument doubling facilitates a topic-shift, as assumed by Veeninga et al. (2011). The results are summarized in Table 2.
In the vast majority of the cases, the argument was not the topic in the preceding discourse (in 465 of the 551 cases, i.e. 84.39%), which is not surprising because most proper nouns in argument doubling referred to discourse-new entities, which could not have been a topic yet in the discourse.While the argument became the topic in the following discourse in 217 occurrences, more often the argument did not become topic of the discourse following the doubling construction, namely in 248 occurrences.
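The proportions for the topic-annotated subset follow directly from the counts in the text, as the short sketch below shows. The variable names are hypothetical, and the number of cases in which the referent was already the topic of the preceding context is derived here by subtraction rather than quoted from the text.

```python
# Counts reported in the text for the topic-annotated subset.
TOTAL = 551
NOT_TOPIC_BEFORE = 465
BECAME_TOPIC = 217       # not topic before, topic afterwards (topic-shift)
STAYED_NON_TOPIC = 248   # not topic before, not topic afterwards

# Derived by subtraction, not reported directly: referent was already
# the topic in the preceding context.
TOPIC_BEFORE = TOTAL - NOT_TOPIC_BEFORE


def percentage(count, total=TOTAL):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


summary = {
    "not topic before": percentage(NOT_TOPIC_BEFORE),
    "topic-shift": percentage(BECAME_TOPIC),
    "never topic": percentage(STAYED_NON_TOPIC),
    "topic before": percentage(TOPIC_BEFORE),
}
for label, value in summary.items():
    print(f"{label}: {value}%")
```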

Discussion
The results of our corpus study reported in the previous section make a clear contribution to the discussion in the literature about the discourse properties of argument doubling in spoken Dutch. Veeninga et al.'s (2011) finding that most referents in argument doubling are discourse-old was not confirmed in our data set. In fact, we found that the referent was discourse-new in about 74% of the cases (997 of 1345), similar to the first discourse function of left dislocation in English, according to Prince (1998).
As shown in Table 2 above, in less than 40% of the cases argument doubling induces a topic-shift, which can therefore not be a main function of argument doubling (contra Stoop 2011 and Veeninga et al. 2011). An example of topic-shift is presented in (14), where Merle becomes the topic after her introduction in an argument doubling construction.
(14) Well, there I go. But I was lucky that I was number two, because Merle (she) was really, really scared, even more scared than me. She was one of the last ones and this guy had no more patience, so he opens the gate.'
B: 'Push.'
A: 'Gate closed.' (CGN: fn007963)
In 45% of the cases, a referent who was not the topic in the preceding discourse did not become the topic in the following discourse either. This is illustrated in (15).
(15) A: Nou ja, dan zouden ze wel andere richtingen kunnen uitvliegen maar ja, ze willen speciaal naar Zuid-Afrika.
B: Ja nee, Vincent die zegt ook dat ie wel naar wel naar Zuid-Afrika zou willen. Ja, omdat de contacten met de mensen daar worden erg aangetrokken hè, met universiteiten. Dus dan kun je jezelf laten uitnodigen. Dat doen heel veel Nederlanders, de taalkundigen laten zich dan uitnodigen.
A: 'Well, they could fly in other directions, but yes, they want to go to South Africa especially.'
B: 'Yeah no, Vincent (he) also says that he would like to go to South Africa. Yes, because the contacts with the people there are very strong, with the universities. So, you can let yourself be invited. Many Dutch people do so, the linguists have themselves invited.' (CGN: fn000249)
As can be seen in Table 2 above, there are even cases (less than 5%) where a topic ceases to be a topic after being used in argument doubling. Example (16) is a case in point.
(16) B: Oh Ronald ja. Oh Ashley uh Eshwin Eshwin. Nee is een andere jongen.
A: Eshwin nee nee Eshwin die heeft bij mij op de basisschool gezeten. Daar weet ik nog wel alle namen van maar da's ook mensen met wie je meer optrekt dan.
A: 'Ronald was his name, a very annoying boy with that orange hair.'
B: 'Oh Ronald yes. O Ashley ehm'
A: 'Eshwin Eshwin. No, that's another boy. Eshwin, no no, Eshwin (he) went to primary school with me. I still remember all their names, but those are the people you hang out more with.' (CGN: fn000525)
In (16), Eshwin is already the topic of the discourse before he appears in the argument doubling construction. After that, he is no longer the topic, as the topic of the discourse switches to people with whom the speaker went to elementary school.
Clearly, although argument doubling is mostly used to introduce a new referent into the discourse, it is certainly not only used in such a context. One of the factors that could play a role and should be investigated further in future research is the type of predicate that appears in argument doubling. Shor (2020) found that argument doubling in Israeli Hebrew mostly occurs with nominal predicates. We also seem to find more predicates referring to properties or states than to activities, with verbs such as zijn 'to be', hebben 'to have', zitten 'to sit', but also gaan 'to go', and doen 'to do'. Besides, we noticed a frequent use of predicates such as zeggen 'say', where usually what is said is more important than who says it, as in the case of reportative evidentiality (Foolen et al. 2018). Such an example is (17), where Thomas is only introduced in the discourse as the source of information, while another person is the (continuing) topic, referred to by the pronouns hij 'he' and hem 'him'.
(17) A: Wat heeft ie dan?
B: Weer?
A: Hij … Ja of lag die in het ziekenhuis?
B: Nee hij heeft erin gelegen. Hij had eh ik weet ook niet precies hoe het zit, maar Thomas die had hem aan de telefoon. En hij schijnt in elkaar gezakt te zijn. Hij had al een tijdje een beetje spierpijn op z'n borst.
A: 'What does he have?'
B: 'Again?'
A: 'He … Yes, or is he in the hospital?'
B: 'No, he has been there. He had ehm I don't know exactly but Thomas (he) had him on the phone. And he seems to have collapsed. He had a little muscle pain on his chest.' (CGN: fn007030)

Conclusion
The aim of this paper was to shed light on the function of argument doubling, a phenomenon which is very common in spoken Dutch. We hope to have shown that argument doubling in Dutch is primarily used to introduce a new discourse referent (contra Veeninga et al. 2011), and not to mark or facilitate a topic-shift (contra Stoop 2011; Veeninga et al. 2011). Introducing a new discourse referent was also argued to be one of the main functions of English left dislocation, in particular for subjects, which usually do not encode new discourse referents (Prince 1998). Since Dutch argument doubling almost exclusively occurs with subjects too, we follow Prince (1998) in her explanation. Because the subject position is dispreferred for the introduction of a new referent into the discourse, argument doubling is used to facilitate processing, for the speaker or the hearer, or both.

(8) Especially Aunt Katherine up here_i,
Everybody talk about it all the time.

Table 1. Contingency table of argument doubling for subjects and objects, and for arguments that are discourse-new and discourse-old

Table 2. Frequency of argument doubling signaling a topic-shift

Cases without argument doubling, with arguments other than proper nouns and with inanimate proper nouns were excluded from the final set. Based on the subcorpus of telephone dialogues in which 386 out of 617 animate proper nouns (62.56%) concerned discourse-new referents, 841 cases should be expected to be discourse-new.
(14) A: Hij doet dat hekje open. Zeg "Joh gek, doe dat hekje dicht." Hij dat hekje weer dicht. Weer praten en praten en praten, dus hekje weer open. Nou daar ga ik dan. Maar ik had dus geluk dat ik nummer twee was, want Merle die was echt heel erg bang, nog banger dan ik die ze was dus als een van de laatsten en daar had die kerel helemaal geen geduld meer dus die doet dat hekje open.
B: Douw.
A: Hekje dicht.
A: 'He opens that gate. Say "Gee dude, close that gate." He closes the gate again. Again talking and talking and talking, so the gate is open again.