Patterns of lexis on the surface of texts
Editors’ introduction
Like Darnton in this volume, Coulthard is interested in the practical uses that can be made of the phenomenon of repetition in text. His concern is with textual plagiarism, which he describes as involving texts in a Matching relation that is intended to remain undetected. This is more than simply a witty choice of phrasing: as noted, for example, in our Introduction, Matching relations rely on repetition, and, in many cases, plagiarists repeat not only the ideas but the wordings of the texts that they plagiarise. Identifying lexical repetition between texts is thus a practical step towards identifying possible cases of plagiarism. Coulthard explores plagiarism in three different areas: literary texts, student essays, and police records of interviews with and statements by suspects. He deals with two major issues: detection and directionality. Detection of plagiarism or unauthorised collaboration between writers can be difficult when, for example, a teacher is faced with large numbers of essays to mark, and even more so when the marking may be shared out amongst different teachers. This is where the occurrence of repetition of lexical items can be exploited. Whereas for Darnton’s purposes what is important is repetition in context (essentially, the repetition, with some changes, of whole sentences rather than of individual words), Coulthard shows that for his purposes measuring the percentage of vocabulary items shared by any two texts is sufficiently revealing. This has the advantage that it can be calculated automatically by computer. When a particularly high level of sharing is noted, the texts can be pulled out and subjected to individual scrutiny to confirm whether plagiarism is indeed involved.

Once plagiarism is identified, the issue of directionality may arise: that is, determining which is the original text and which is the plagiarised one.
With published texts this is normally a simple matter, since the chronology can be decided by date of publication; but with student essays and police records the analyst will typically need to rely on evidence in the texts themselves. Coulthard discusses various methods by which directionality can be established. At this point, his focus switches from repetition between the texts to cohesion (repetition and conjunction) within each text: that is, to the ways in which the texts are organised and the organisation is signalled. He demonstrates that his approach can be used to illuminate the process by which a supposedly independent text has in fact been derived from another, a process which may have extremely serious implications in legal cases.

Underlying any discussion of plagiarism is the question of ‘voices’: how far is it possible to identify a writer’s personal voice, or style, and to detect places where that voice is overlaid or replaced by the voice of another? Coulthard argues that one way of approaching this question is through repetition: texts (and the body of texts produced by each writer) have their own norms in terms of the language choices that the writers make. This raises an interesting comparison with Scott’s paper in this volume: the two papers can be seen as complementary in certain respects. If Scott deals with the ‘aboutness’ of texts and highlights what texts have in common despite their diversity, Coulthard’s study might be characterised as dealing with the ‘who-ness’ of texts and highlighting essentially what makes texts distinctive despite their similarities.
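
By way of illustration only, the detection measure mentioned above (the percentage of vocabulary items shared by two texts) might be computed along the following lines. This is a minimal sketch in Python, not Coulthard’s own procedure: the tokenisation rule, the choice to express shared items as a percentage of the two texts’ combined vocabulary, the 50 per cent threshold, and the function names are all assumptions made for the example.

```python
import re
from itertools import combinations


def vocabulary(text):
    """Return the set of distinct lowercased word forms in a text.

    Assumption: a 'vocabulary item' is any run of ASCII letters,
    optionally with one internal apostrophe (e.g. "don't").
    """
    return set(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))


def shared_vocabulary_percentage(text_a, text_b):
    """Shared word forms as a percentage of the combined vocabulary.

    Assumption: the denominator is the union of both vocabularies;
    other denominators (e.g. the smaller vocabulary) are equally possible.
    """
    vocab_a, vocab_b = vocabulary(text_a), vocabulary(text_b)
    combined = vocab_a | vocab_b
    if not combined:
        return 0.0
    return 100.0 * len(vocab_a & vocab_b) / len(combined)


def flag_pairs(essays, threshold=50.0):
    """Compare every pair of essays and return those at or above threshold.

    `essays` maps a label to the essay text. The threshold is a
    hypothetical value; flagged pairs still need individual scrutiny.
    """
    flagged = []
    for (name_a, text_a), (name_b, text_b) in combinations(essays.items(), 2):
        pct = shared_vocabulary_percentage(text_a, text_b)
        if pct >= threshold:
            flagged.append((name_a, name_b, pct))
    # Highest level of sharing first
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

A call such as `flag_pairs({"essay1": text1, "essay2": text2, "essay3": text3})` would return only those pairs whose shared vocabulary reaches the chosen level. As the introduction stresses, such a figure can do no more than pick out texts for closer reading; it does not by itself establish plagiarism or its direction.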