Volume 21, Issue 2
  • ISSN 1384-6655
  • E-ISSN: 1569-9811


The study of swearing has increased in the last decade, diversifying to include a wider range of data and methods of analysis. Nevertheless, certain types of data, specifically large corpora of computer-mediated communication (CMC), have not been studied extensively. In this paper, we fill this gap by studying the use of swearwords in blog data, and we illustrate ways of identifying swearing in a large corpus by taking context into account. This approach, based on the examination of shared and unique collocates of known expletives, helps to distinguish attestations of swearing from non-swearing uses of polysemous lexemes and supports the analysis of overlaps in the usage and meaning of swearwords. This work therefore goes beyond basic sentiment analysis and offers new insights into the use of collocation for refining profanity filters, providing innovative perspectives on issues of growing importance as online interaction becomes more widespread.
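
The idea of comparing shared and unique collocates can be illustrated with a minimal sketch. It is not the procedure used in the paper: the toy sentence, the function names `collocates` and `shared_and_unique`, the window size, and the use of raw co-occurrence counts (rather than a statistical association measure) are all assumptions made for illustration.

```python
from collections import Counter


def collocates(tokens, node, window=2):
    """Count words found within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Count every token in the window except the node occurrence itself.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def shared_and_unique(tokens, node_a, node_b, window=2, min_freq=1):
    """Return (shared, only_a, only_b) collocate sets for two node words."""
    a = {w for w, c in collocates(tokens, node_a, window).items() if c >= min_freq}
    b = {w for w, c in collocates(tokens, node_b, window).items() if c >= min_freq}
    return a & b, a - b, b - a


# Hypothetical toy data: "bloody" appears once as an intensifier and once literally.
text = (
    "that was a bloody awful film . "
    "the bandage was bloody and wet . "
    "that was a damn awful meal"
)
tokens = text.split()

shared, only_bloody, only_damn = shared_and_unique(tokens, "bloody", "damn")
print(sorted(shared))       # collocates the two words have in common
print(sorted(only_bloody))  # collocates seen only with "bloody"
print(sorted(only_damn))    # collocates seen only with "damn"
```

In this toy case the shared collocates include "awful", which points to the intensifier use that "bloody" has in common with "damn", while "bandage" and "wet" appear only with "bloody" and point to its literal sense. A realistic setup would presumably filter function words and rank collocates by an association score over a much larger corpus.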




  • Article Type: Research Article
Keyword(s): blogs; CMC; collocation; pragmatics; swearing