Volume 4, Issue 2
  • ISSN 2542-9477
  • E-ISSN: 2542-9485

Like lexical and grammatical choices, the length of a text is guided by situational constraints and functional needs. Consequently, texts of different lengths are associated with different communicative functions. This study explores how register shapes the functions associated with comment length on the social media platform Reddit. Since registers differ in their functional and situational makeup, the same text length may have different functions in different registers. By analyzing variation in the frequencies of register features across comment lengths in a number of popular subreddits in a large-scale dataset of Reddit comments, I show that the functional associations of text length can differ greatly between subreddits, and that comments of the same length can even have virtually opposite functions in different subreddits. Furthermore, some subregisters are clearly differentiated not only by their feature makeup but also by the length of their comments.





  • Article Type: Research Article
Keyword(s): functional variation; Reddit; register analysis; social media; text length