Volume 1, Issue 2
  • ISSN 2542-9477
  • E-ISSN: 2542-9485



While the language of the internet has become an increasingly popular research topic, many areas remain understudied and deserve more attention. This study explores register variation within the social media website Reddit using the multi-dimensional approach developed by Douglas Biber. Reddit, the third most popular English-language social media website after the giants Facebook and Twitter, is made up of thousands of user-created ‘subreddits’: subcommunities centered around different topics, where users make posts and comment on them. Having many different communities and topic areas under one roof makes Reddit a particularly fruitful source of research material. In this paper, three register dimensions are extracted from data collected over one month from a group of thirty-seven subreddits: ‘On-line Subjective Production’, ‘Informational Style’ and ‘Instructional Focus’. These dimensions describe register variation within Reddit in meaningful ways and are in line with suggested register universals (Biber 2014).




  1. Berber Sardinha, T.
    (2014) Comparing internet and pre-internet registers. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 81–105). Amsterdam: John Benjamins.
    https://doi.org/10.1075/scl.60.03ber
  2. Berber Sardinha, T., & Veirano Pinto, M.
    (2014) Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber. Amsterdam: John Benjamins.
    https://doi.org/10.1075/scl.60
  3. Biber, D.
    (1988) Variation across speech and writing. Cambridge: Cambridge University Press.
    https://doi.org/10.1017/CBO9780511621024
  4. Biber, D.
    (1993) Representativeness in corpus design. Literary and Linguistic Computing, 8(4), 243–257.
    https://doi.org/10.1093/llc/8.4.243
  5. Biber, D.
    (2014) Using multi-dimensional analysis to explore cross-linguistic universals of register variation. Languages in Contrast, 14(1), 7–34.
    https://doi.org/10.1075/lic.14.1.02bib
  6. Biber, D., & Egbert, J.
    (2015) Using grammatical features for automatic register identification in an unrestricted corpus of documents from the open web. Journal of Research Design and Statistics in Linguistics and Communication Science, 2(1), 3–36.
    https://doi.org/10.1558/jrds.v2i1.27637
  7. Biber, D., & Egbert, J.
    (2016) Register variation on the searchable web: A multi-dimensional analysis. Journal of English Linguistics, 44(2), 95–137.
    https://doi.org/10.1177/0075424216628955
  8. Biber, D., & Gray, B.
    (2013) Being specific about historical change: The influence of sub-register. Journal of English Linguistics, 41, 104–134.
    https://doi.org/10.1177/0075424212472509
  9. Biber, D., & Kurjian, J.
    (2007) Towards a taxonomy of web registers and text types: A multi-dimensional analysis. In M. Hundt, N. Nesselhauf, & C. Biewer (Eds.), Corpus Linguistics and the Web (pp. 109–132). Amsterdam: Rodopi.
    https://doi.org/10.1163/9789401203791_008
  10. Chandrasekharan, E., Pavalanathan, U., Srinivasan, A., Glynn, A., Eisenstein, J., & Gilbert, E.
    (2017) You can’t stay here: The effectiveness of Reddit’s 2015 ban through the lens of hate speech. Proceedings of the ACM on Human-Computer Interaction, 1.
    https://doi.org/10.1145/3134666
  11. Cole, J. R., Ghafurian, M., & Reitter, D.
    (2017, November 13). Is word adoption a grassroots process? An analysis of Reddit communities. In D. Lee, Y. R. Osgood, & R. Thomson (Eds.), International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (pp. 236–241). Berlin: Springer.
    https://doi.org/10.17605/OSF.IO/JWKXR
  12. Collot, M., & Belmore, N.
    (1996) Electronic language: A new variety of English. In S. C. Herring (Ed.), Computer-mediated communication (pp. 13–28). Amsterdam/Philadelphia: John Benjamins.
    https://doi.org/10.1075/pbns.39.04col
  13. Conrad, S., & Biber, D.
    (Eds.) (2001) Variation in English: Multi-dimensional studies. Harlow: Pearson Education.
  14. Coscia, M.
    (2018) Popularity spikes hurt future chances for viral propagation of protomemes. Communications of the ACM, 61(1), 70–77.
    https://doi.org/10.1145/3158227
  15. Davies, M.
    (2016) Corpus of Online Registers of English (CORE). Available from corpus.byu.edu/core/
  16. De Choudhury, M., & De, S.
    (2015) Mental health discourse on Reddit: Self-disclosure, social support, and anonymity. In Proceedings of the 8th International Conference on Weblogs and Social Media, ICWSM 2014 (pp. 71–80).
  17. Egbert, J., Biber, D., & Davies, M.
    (2015) Developing a bottom-up, user-based method of web register classification. Journal of the Association for Information Science and Technology, 66(9), 1817–1831.
    https://doi.org/10.1002/asi.23308
  18. Eisenstein, J.
    (2013) What to do about bad language on the internet. In Proceedings of the North American chapter of the Association for Computational Linguistics (NAACL) 2013 (pp. 359–369).
  19. Finlay, S. C.
    (2014) Age and gender in Reddit commenting and success. Journal of Information Science Theory and Practice, 2(3), 18–28.
    https://doi.org/10.1633/JISTaP.2014.2.3.2
  20. Friginal, E.
    (2013) Twenty-five years of Biber’s multi-dimensional analysis [Special Issue]. Corpora, 8(2).
    https://doi.org/10.3366/cor.2013.0038
  21. Gkotsis, G., Oellrich, A., Hubbard, T., & Dobson, R.
    (2016) The language of mental health problems in social media. In Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology (pp. 63–73). Stroudsburg, PA: Association for Computational Linguistics.
    https://doi.org/10.18653/v1/W16-0307
  22. Grieve, J., Biber, D., Friginal, E., & Nekrasova, T.
    (2011) Variation among blog text types: A multi-dimensional analysis. In A. Mehler, S. Sharoff, & M. Santini (Eds.), Genres on the web: Corpus studies and computational models (pp. 302–322). New York, NY: Springer.
  23. Haralabopoulos, G., Anagnostopoulos, I., & Zeadally, S.
    (2015) Lifespan and propagation of information in on-line social networks: A case study based on Reddit. Journal of Network and Computer Applications, 56, 88–100.
    https://doi.org/10.1016/j.jnca.2015.06.006
  24. Hardy, J., & Friginal, E.
    (2012) Filipino and American online communication and linguistic variation. World Englishes, 31(1), 1–19.
  25. Hess, C. W., Haug, H. T., & Landry, R. G.
    (1989) The reliability of type-token ratios for the oral language of school age children. Journal of Speech and Hearing Research, 32, 536–540.
    https://doi.org/10.1044/jshr.3203.536
  26. Hess, C. W., Sefton, K. M., & Landry, R. G.
    (1986) Sample size and type-token ratios for oral language of preschool children. Journal of Speech and Hearing Research, 29, 129–134.
    https://doi.org/10.1044/jshr.2901.129
  27. Huang, Y., Guo, D., Kasakoff, A., & Grieve, J.
    (2016) Understanding US regional linguistic variation with Twitter data analysis. Computers, Environment and Urban Systems, 59, 244–255.
    https://doi.org/10.1016/j.compenvurbsys.2015.12.003
  28. Jonsson, E.
    (2015) Conversational writing: A multidimensional study of synchronous and supersynchronous computer-mediated communication. Frankfurt: Peter Lang.
  29. Literat, I., & van den Berg, S.
    (2017) Buy memes low, sell memes high: Vernacular criticism and collective negotiations of value on Reddit’s MemeEconomy. Information, Communication & Society.
    https://doi.org/10.1080/1369118X.2017.1366540
  30. Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J., & McClosky, D.
    (2014) The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 55–60).
    https://doi.org/10.3115/v1/P14-5010
  31. McEwan, B.
    (2016) Communication of communities: Linguistic signals of online groups. Information, Communication & Society, 19(9), 1233–1249.
    https://doi.org/10.1080/1369118X.2016.1186717
  32. Munro, R., & Manning, C. D.
    (2012) Short message communications: Users, topics, and in-language processing. In ACM DEV ’12 Proceedings of the 2nd ACM Symposium on Computing for Development.
    https://doi.org/10.1145/2160601.2160607
  33. Park, A., & Conway, M.
    (2018) Examining thematic similarity, difference, and membership in three online mental health communities from Reddit: A text mining and visualization approach. Computers in Human Behavior, 78, 98–112.
    https://doi.org/10.1016/j.chb.2017.09.001
  34. Pavalanathan, U., Fitzpatrick, J., Kiesling, S. F., & Eisenstein, J.
    (2017) A multidimensional lexicon for interpersonal stancetaking. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (pp. 884–895).
    https://doi.org/10.18653/v1/P17-1082
  35. Revelle, W.
    (2017) psych: Procedures for psychological, psychometric, and personality research (Version 1.7.5). Illinois, USA: Northwestern University. Retrieved from https://CRAN.R-project.org/package=psych
  36. Richterich, A.
    (2014) ‘Karma, precious karma!’ Karmawhoring on Reddit and the front page’s econometrisation. Journal of Peer Production, 4. Retrieved from peerproduction.net/issues/issue-4-value-and-currency/peer-reviewed-articles/karma-precious-karma/
  37. Schnoebelen, T.
    (2012) Do you smile with your nose? Stylistic variation in Twitter emoticons. University of Pennsylvania Working Papers in Linguistics, 18(2), 115–125.
  38. Singer, P., Ferrara, E., Kooti, F., Strohmaier, M., & Lerman, K.
    (2016) Evidence of online performance deterioration in user sessions on Reddit. PLoS ONE, 11(8).
    https://doi.org/10.1371/journal.pone.0165852
  39. Stewart, I., & Eisenstein, J.
    (2018, February 21). Making “fetch” happen: The influence of social and linguistic context on the success of lexical innovations. arXiv:1709.00345v3 [cs.CL].
    https://doi.org/10.18653/v1/D18-1467
  40. Titak, A., & Roberson, A.
    (2013) Dimensions of web registers: An exploratory multi-dimensional comparison. Corpora, 8(2), 239–271.
    https://doi.org/10.3366/cor.2013.0042
  41. Tsou, A.
    (2016) How does the front page of the internet behave? Readability, emoticon use, and links on Reddit. First Monday, 21(11).
    https://doi.org/10.5210/fm.v21i11.7013
  42. Vickery, J. R.
    (2014) The curious case of Confession Bear: The reappropriation of online macro-image memes. Information, Communication & Society, 17(3), 301–325.
    https://doi.org/10.1080/1369118X.2013.871056


  • Article Type: Research Article
Keyword(s): internet; multi-dimensional analysis; Reddit; social media