Volume 4, Issue 2
  • ISSN 2542-9477
  • E-ISSN: 2542-9485



This paper introduces an initial text typology of social media posts from a multi-dimensional (MD) perspective. Text types are “[g]roupings of text that are similar in their linguistic form” (Biber 1989: 13). This text typology is based on a new MD analysis of social media messages presented in the paper. The corpus consists of 60,000 social media messages in English compiled from Facebook, Twitter, Instagram, Reddit, Telegram, and YouTube. After the texts were cleaned up, the corpus was tagged with the Biber Tagger and post-processed with the Biber Tag Count. Three dimensions of variation were determined, each representing an underlying parameter of variation. Once the texts were scored on each of the dimensions, a k-means cluster analysis was carried out, and the optimal number of clusters was determined using the Cubic Clustering Criterion statistic. A two-way typology was developed based on the dimensional characteristics of each cluster and on careful qualitative analysis of text samples.
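The clustering step described above can be illustrated with a minimal sketch. It assumes each text has already been reduced to a tuple of three dimension scores; the toy scores, the function names, and the deterministic farthest-point initialization are illustrative assumptions, not the authors' procedure. The Cubic Clustering Criterion used in the paper to choose the number of clusters is not reproduced here, so k is simply fixed at 3.

```python
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two score tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest chosen centroid is farthest away.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain k-means: assign each text to its nearest centroid, then recompute centroids."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids


# Hypothetical dimension scores (Dim 1, Dim 2, Dim 3) for nine texts.
scores = [
    (5.1, -2.0, 0.3), (4.8, -1.7, 0.1), (5.4, -2.2, 0.5),
    (-3.9, 4.2, 0.2), (-4.3, 3.8, -0.1), (-4.0, 4.5, 0.4),
    (0.2, 0.1, -6.0), (-0.1, 0.4, -5.6), (0.3, -0.2, -6.3),
]
labels, centroids = kmeans(scores, k=3)
```

In a text typology, each resulting cluster would then be interpreted as a candidate text type by inspecting its centroid (its mean score on each dimension) alongside sample texts from the cluster.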




  1. Adam, J. M.
    (2011) A linguística textual – Introdução à análise textual dos discursos [La linguistique textuelle. Introduction à l’analyse textuelle des discours] (M. D. G. Rodrigues, J. G. D. Silva Neto, L. Passeggi, & E. F. Leurquin, Trans.). São Paulo: Cortez.
  2. Beaugrande, R. A. D., & Dressler, W. U.
    (1981) Introduction to text linguistics. London: Longman.
    https://doi.org/10.4324/9781315835839
  3. Berber Sardinha, T.
    (2014) Comparing Internet and pre-Internet registers. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 81–107). Amsterdam/Philadelphia: John Benjamins.
    https://doi.org/10.1075/scl.60.03ber
  4. Berber Sardinha, T.
    (2017) Text types in Brazilian Portuguese: A multidimensional perspective. Corpora, 12(3), 483–515.
    https://doi.org/10.3366/cor.2017.0129
  5. Berber Sardinha, T.
    (2018) Dimensions of variation across Internet registers. International Journal of Corpus Linguistics, 23(2), 125–157.
    https://doi.org/10.1075/ijcl.15026.ber
  6. Berber Sardinha, T.
    (2022) Corpus linguistics and the study of social media: A case study using multi-dimensional analysis. In A. O’Keeffe & M. McCarthy (Eds.), The Routledge handbook of corpus linguistics (pp. 656–674). New York: Routledge.
    https://doi.org/10.4324/9780367076399-46
  7. Berber Sardinha, T., Kauffmann, C., & Acunzo, C. M.
    (2014) Dimensions of register variation in Brazilian Portuguese. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 35–80). Amsterdam/Philadelphia: John Benjamins.
    https://doi.org/10.1075/scl.60.02ber
  8. Berber Sardinha, T., & Shimazumi, M.
    (2021) A text typology of argumentative essays based on the new ICLE v.3. Paper presented at the 11th International Corpus Linguistics Conference 2021, Limerick, Ireland.
  9. Berber Sardinha, T., & Veirano Pinto, M.
    (Eds.) (2014) Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber. Amsterdam/Philadelphia: John Benjamins.
    https://doi.org/10.1075/scl.60
  10. Berber Sardinha, T., & Veirano Pinto, M.
    (Eds.) (2019) Multi-dimensional analysis: Research methods and current issues. London: Bloomsbury Academic.
    https://doi.org/10.5040/9781350023857
  11. Berber Sardinha, T., & Veirano Pinto, M.
    (2021) A linguistic typology of American television. International Journal of Corpus Linguistics, 26(1), 127–160.
    https://doi.org/10.1075/ijcl.00039.ber
  12. Biber, D.
    (1988) Variation across speech and writing. Cambridge: Cambridge University Press.
    https://doi.org/10.1017/CBO9780511621024
  13. Biber, D.
    (1989) A typology of English texts. Linguistics, 27(1), 3–43.
    https://doi.org/10.1515/ling.1989.27.1.3
  14. Biber, D.
    (1993) Representativeness in corpus design. Literary and Linguistic Computing, 8(4), 243–257.
    https://doi.org/10.1093/llc/8.4.243
  15. Biber, D.
    (1995) Dimensions of register variation – a cross-linguistic comparison. Cambridge: Cambridge University Press.
    https://doi.org/10.1017/CBO9780511519871
  16. Biber, D., & Egbert, J.
    (2018) Register variation online. Cambridge: Cambridge University Press.
    https://doi.org/10.1017/9781316388228
  17. Biber, D., & Kurjian, J.
    (2007) Towards a taxonomy of web registers and text types: A multi-dimensional analysis. In M. Hundt, N. Nesselhauf, & C. Biewer (Eds.), Corpus linguistics and the web (pp. 109–132). Amsterdam/New York: Rodopi.
  18. Bronckart, J. P.
    (1999) Atividades de linguagem, discursos e textos [Language activities, discourses and texts] (A. R. Machado, Trans.). São Paulo: EDUC.
  19. Charaudeau, P.
    (2009) Linguagem e discurso: Modos de organização [Langage et Discours – Eléments de sémiolinguistique] (A. M. S. Correa, Trans.). São Paulo, SP: Contexto.
  20. Clarke, I.
    (2020) Linguistic variation across Twitter and Twitter trolling. (PhD Dissertation). University of Birmingham, Birmingham.
  21. Clarke, I.
    (2022) A Multi-dimensional analysis of English tweets. Language and Literature. Advance online publication.
    https://doi.org/10.1177/09639470221090369
  22. Clarke, I., & Grieve, J.
    (2019) Stylistic variation on the Donald Trump Twitter account: A linguistic analysis of tweets posted between 2009 and 2018. PLOS ONE, 14(9), e0222062.
    https://doi.org/10.1371/journal.pone.0222062
  23. Egbert, J., & Staples, S.
    (2019) Doing multi-dimensional analysis in SPSS, SAS, and R. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis: Research methods and current issues (pp. 125–144). London/New York: Bloomsbury Academic.
    https://doi.org/10.5040/9781350023857.0015
  24. Fairchild, C.
    (2007) Building the authentic celebrity: The ‘idol’ phenomenon in the attention economy. Popular Music and Society, 30(3), 355–375.
    https://doi.org/10.1080/03007760600835306
  25. Friginal, E., & Hardy, J. A.
    (2014) Conducting multi-dimensional analysis using SPSS. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 298–316). Amsterdam/Philadelphia: John Benjamins.
    https://doi.org/10.1075/scl.60.10fri
  26. Friginal, E., & Hardy, J.
    (2019) From factors to dimensions: Interpreting linguistic co-occurrence patterns. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis: Research methods and current issues (pp. 145–164). London: Bloomsbury Academic.
    https://doi.org/10.5040/9781350023857.0016
  27. Friginal, E., Waugh, O., & Titak, A.
    (2018) Linguistic variation in Facebook and Twitter posts. In E. Friginal & J. A. Hardy (Eds.), Studies in corpus-based sociolinguistics (pp. 342–362). London: Routledge.
  28. Goulart, L., & Wood, M.
    (2019) Methodological synthesis of research using multi-dimensional analysis. Journal of Research Design and Statistics in Linguistics and Communication Science, 6(2), 107–137.
  29. Gray, B.
    (2019) Tagging and counting linguistic features for multi-dimensional analysis. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis: Research methods and current issues (pp. 43–66). London/New York: Bloomsbury Academic.
    https://doi.org/10.5040/9781350023857.0011
  30. Holgado-Tello, F. P., Chacon-Moscoso, S., Barbero-Garcia, I., & Vila-Abad, E.
    (2010) Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Quality & Quantity, 44(1), 153–166.
    https://doi.org/10.1007/s11135-008-9190-y
  31. Liimatta, A.
    (2019) Exploring register variation on Reddit: A multi-dimensional study of language use on a social media website. Register Studies, 1(2), 269–295.
    https://doi.org/10.1075/rs.18005.lii
  32. Longacre, R. E.
    (1983) The grammar of discourse. New York: Plenum Press.
  33. Marwick, A.
    (2015) Instafame: Luxury selfies in the attention economy. Public Culture, 27(1), 137–160.
    https://doi.org/10.1215/08992363-2798379
  34. McCulloch, G.
    (2019) Because Internet: Understanding the new rules of language. New York: Riverhead Books.
  35. O’Halloran, K.
    (2022) Posthumanism and corpus linguistics. In A. O’Keeffe & M. McCarthy (Eds.), The Routledge handbook of corpus linguistics (pp. 675–692). New York: Routledge.
    https://doi.org/10.4324/9780367076399-47
  36. Prina Dutra, D., & Berber Sardinha, T.
    (2018) A linguistic typology of sections in research articles: A multi-dimensional perspective. Paper presented at the Arizona Corpus Linguistics Conference (AZCL), Flagstaff, AZ, USA.
  37. Rüdiger, S., & Dayter, D.
    (2020) The expanding landscape of corpus-based studies of social media language. In S. Rüdiger & D. Dayter (Eds.), Corpus approaches to social media (Vol. 98, Studies in Corpus Linguistics, pp. 1–13). Amsterdam, New York: John Benjamins.
    https://doi.org/10.1075/scl.98.int
  38. Sarle, W. S.
    (1983) Cubic clustering criterion. Cary: SAS Institute Inc.
  39. Shulman, D.
    (2017) The presentation of self in contemporary social life. Los Angeles: Sage.
    https://doi.org/10.4135/9781506340913
  40. Tannen, D.
    (1982) Oral and literate strategies in spoken and written narratives. Language, 58(1), 1–21.
    https://doi.org/10.2307/413530
  41. Titak, A., & Roberson, A.
    (2013) Dimensions of web registers: An exploratory multi-dimensional comparison. Corpora, 8(2), 235–260.
    https://doi.org/10.3366/cor.2013.0042
  42. van der Goot, R.
    (2019) MoNoise: A multilingual and easy-to-use lexical normalization tool. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. ACL: Florence, pp. 201–206.
    https://doi.org/10.18653/v1/P19-3032
  43. Werlich, E.
    (1983) A text grammar of English. Heidelberg: Quelle & Meyer.


  • Article Type: Research Article
Keyword(s): multi-dimensional analysis; social media; text typology; variation