Lingvisticæ Investigationes - Volume 35, Issue 2, 2012
-
Seek&Hide: Anonymising a French SMS corpus using natural language processing techniques
Author(s): Pierre Accorsi, Namrata Patel, Cédric Lopez, Rachel Panckhurst and Mathieu Roche
pp.: 163–180 (18)
This article presents the system Seek&Hide, a text message processing tool developed for the sud4science LR (http://www.sud4science.org/) project. It performs the anonymisation/de-identification of a corpus. At present, it has been used to anonymise the sud4science LR corpus of French text messages collected during the project. This is done in two phases. In the first phase, it automatically processes over 70% of the corpus. The rest of the corpus is processed in the second phase, aided by an expert annotator via a web interface specifically designed to simplify the task.
-
SMS experience and textisms in young adolescents: Presentation of a longitudinally collected corpus
Author(s): Josie Bernicot, Olga Volckaert-Legrier, Antonine Goumi and Alain Bert-Erboul
pp.: 181–198 (18)
The aim of this paper was to study the characteristics of SMSes in a population for which there is currently only limited data: young adolescents (girls and boys) between 11 and 12 years of age. The analysis focused on a corpus of 4,524 SMSes sent by 19 informants in everyday real-life situations over a one-year period. At the beginning of the study, the participants were complete novices. This study sets forth a new analysis grid which distinguishes between two categories of textisms, defined on the basis of the following cognitive processes: (a) textisms which are consistent with the traditional written code of grapheme–phoneme correspondence and (b) textisms which break with this traditional code. On the whole, the density of textisms was 0.52 and 0.26, respectively, for each kind of textism. The results showed an increase in the density of textisms with SMS experience (from month 1 to month 12), but also a variation depending upon the type of textism and the gender of the texter. For boys, the density of both types of textisms increased with SMS experience, while for girls, the density increased only for textisms which broke with the code. The results were interpreted in terms of the construction of an SMS register with specific linguistic markers resulting from a different use of traditional writing rules or the use of inventions as compared to traditional writing.
-
Automatic or Controlled Writing?: The Effect of a Dual Task on SMS Writing in Novice and Expert Adolescents
Author(s): Céline Combes, Olga Volckaert-Legrier and Pierre Largy
pp.: 199–217 (19)
The objective of this study was to attempt to distinguish the various processes of producing SMS spelling forms. The production of these different spelling forms was compared by means of an experimental paradigm: the dual task. This paradigm aimed at identifying the attentional resources necessary for the process of producing SMS spelling. Another way in which to address the degree of automation of these production processes was to compare SMS productions in terms of the level of SMS writing expertise. The results of this study demonstrated that the spelling forms produced in SMS language (eSMS), and therefore their production process, differ according to the degree of SMS writing expertise and the attention that the participants are able to devote to the SMS writing task. The results confirm that SMS writing represents a cognitive cost for novice texters and tends to become automatic as the users acquire expertise.
-
Development of SMS language from 2000 to 2010: A comparison of two corpora
Author(s): Úrsula Kirsten-Torrado
pp.: 218–236 (19)
SMS language is regarded as a ‘new’ communication system (cf. T. Shortis 2007; D. Crystal 2009; M. Markus 2010) characterized by new relationships that native speakers establish between English spelling and pronunciation by using different respelling devices (cf. C. Thurlow 2003; P. López Rúa 2007; D. Crystal 2009). This paper is an attempt to contribute to this recent area of study by analysing the development of SMS language over the last 10 years. Recent findings suggest that even though SMS language might have emerged out of the need for speed and brevity (C. Thurlow 2003: 4), given that every SMS has a limited number of characters, it seems to have evolved into a fashionable and stylish way of writing where shortened versions of the words are not always the aim of the respelling. For the purpose of a diachronic analysis, a free online British SMS corpus available at netting-it.com, containing 201 text messages compiled in 2000, has been compared with my own data obtained in May 2010 by means of questionnaires carried out in a London secondary school. Thus, it was possible to analyse the differences in the use of SMS language during the last decade. The research proves that one of the most significant changes is the use of ‘stylish talk’, a new device which consists in lengthening words to emphasize accent, slang, and attitude. This contrasts with the general belief that words are merely shortened in SMS language. Moreover, the use of slang and ungrammatical expressions also seems to be a frequent device.
-
Texto4Science: A Quebec French database of annotated text messages
Author(s): Philippe Langlais and Patrick Drouin
pp.: 237–259 (23)
In October 2009, the Quebec French part of the international SMS4science project, called texto4science, was launched. Over a period of 10 months, we collected slightly more than 7,000 SMSes that we carefully annotated. This database is now ready to be used by the community. The purpose of this article is to relate the efforts put into designing this database and provide some data analysis of the main linguistic phenomena that we have annotated. We also report on a sociolinguistic survey we conducted within the project.
-
SMS communication as plurilingual communication: Hybrid language use as a challenge for classical code-switching categories
Author(s): Étienne Morel, Claudia Bucher, Simona Pekarek-Doehler and Beat Siebenhaar
pp.: 260–288 (29)
The use of more than one language in SMS communication is widespread, yet has remained relatively underexplored in the existing research. In this paper we ask: What methodological and conceptual tools are needed for empirically investigating code-switching in large databases of SMS communication? We show that the investigation of SMS communication calls for an adaptation of the conceptual and the methodological apparatus of classical code-switching studies, which have been typically concerned with the analysis of spoken, mostly interactional, data. We argue for a broad understanding of code-switching that comprises switching between natural languages and language varieties along with style shifts as well as switching between language and other semiotic systems (ideographic switching). We also document, as a key feature of SMS communication, hybrid forms of language use that blur the boundaries between what we commonly call languages (e.g. homographs, mixed spellings or allogenisms), and we suggest that these possibly indicate that SMS communication has become one site where the tension between localized and globalized social practices is played out. The study presented here is part of an inter-university research project, entitled “SMS communication in Switzerland: Facets of linguistic variation in a multilingual country”, based on a corpus of 26,000 authentic messages collected between 2009 and 2011.
-
French text messages: From SMS data collection to preliminary analysis
Author(s): Rachel Panckhurst and Claudine Moïse
pp.: 289–317 (29)
Over a three-month period (spanning 15 September to 15 December 2011), more than 90,000 authentic text messages in French were collected by a group of academics in the Languedoc-Roussillon region of France. This paper retraces the organisation of the data collection, the elaboration of the sociolinguistic questionnaire that donors were invited to fill out, text message data processing procedures and preliminary results. A shift from individual “isolated” text messages to “conversational” SMS exchanges is then studied, in preparation for a new SMS conversational data collection which is due to take place in the near future. This whole process is important for understanding in-depth interactional practices within contemporary digital textuality and should provide insight for pluri-disciplinary approaches.
-
A sociolinguistic analysis of transnational SMS practices: Non-elite multilingualism, grassroots literacy and social agency among migrant populations in Barcelona
Author(s): Maria Sabaté i Dalmau
pp.: 318–340 (23)
From the field of the sociolinguistics of globalisation, this article investigates the linguistic features of transnational SMS talk, focusing on the heteroglossic and hybrid multilingual text messaging practices and the ICT-mediated vernacular literacies of a very heterogeneous small group of migrants establishing transnational networks in the outskirts of Barcelona. It shows that migrants employ highly flexible, non-elite linguae francae or “we-codes” for successful inter-group communication which are based on heterography, orality, anti-standardness and transidiomaticity. It also explores the social indexicalities of such SMS practices, and claims that, against a highly ideologised discursive regime which classifies them as “faulty” or “deviant”, transnational migrants’ text messages offer an insight into how these highly mobile citizens attain the necessary degree of social agency to unfold their many transnational identities, re-negotiate their belonging and entitlement to host-society resources, and manage to organise their life trajectories and prospects largely successfully.
-
Negation marking in French text messages
Author(s): Elisabeth Stark
pp.: 341–366 (26)
This study investigates the drop of the first clitic element, ne, of French sentential negation. It is based on 4,628 French text messages taken from the newly established corpus sms4science.ch. It shows that regional or stylistic factors do not play a major role in triggering ne deletion or ne retention, and that the only relevant language-internal factor is subject type, more precisely clitic subjects (also in subject doubling structures) and subject drop, which seem to favour or even categorically trigger ne deletion. Our findings are in keeping with those by R. van Compernolle (2008) on European French in online chats, and thereby indicate partially specific regularities in French computer-mediated communication (CMC). The findings question traditional assumptions on ne drop as a variety marker in French, irrespective of the phonic or graphic nature of the data (cf. P. Koch and W. Oesterreicher 2011). In fact, even in graphically deviant data, ne drop seems to follow the observed robust language-internal regularities. Some data seem to indicate that the graphic nature of SMSes also plays a role in ne drop (cf. similar conclusions by R. van Compernolle 2008 for chats); this has to be checked by examining a larger number of French text messages.
-
“i didn’t spel that wrong did i. Oops”: Analysis and normalisation of SMS spelling variation
Author(s): Caroline Tagg, Alistair Baron and Paul Rayson
pp.: 367–388 (22)
Spelling variation, although present in all varieties of English, is particularly prevalent in SMS text messaging. Researchers argue that spelling variants in SMSes are principled and meaningful, reflecting patterns of variation across historical and contemporary texts, and contributing to the performance of social identities. However, little attempt has yet been made to empirically validate SMS spelling patterns (for most languages, with the notable exception of French) and verify the extent to which they mirror those in other texts. This article reports on the use of the VARD2 tool to analyse and normalise the spelling variation in a corpus of over 11,000 SMSes collected in the UK between 2004 and 2007. A second tool, DICER, was used to examine the variant and equivalent mappings from the normalised corpus. The database of rules and frequencies enables comparison with other text types and the automatic normalisation of spelling in larger SMS corpora. As well as examining various spelling trends with the DICER analysis, it was also possible to place the spelling variants found in the SMS corpus into functional categories, the ultimate aim being to create a taxonomy of SMS spelling. The article reports on the findings from this categorisation process, whilst also discussing the difficulty in choosing categories for some spelling variants.
-
Lol, mdr and ptdr: An inclusive and gradual approach to discourse markers
Author(s): Deniz Uygur-Distexhe
pp.: 389–413 (25)
Abbreviations: LOL (Laughing Out Loud), MDR (Mort De Rire), PTDR (PéTé De Rire).
This inclusive and gradual approach allows for doubt and lack of knowledge of the context in the analysis process. More broadly, this work also throws light on texting as a spontaneous computer-mediated communication type with writing constraints imposed by the communication medium and the situation.