Volume 3, Issue 1
  • ISSN 2542-9477
  • E-ISSN: 2542-9485



This article introduces a new method for grouping keywords and examines the extent to which it also allows analysts to explore the interaction of discourse and subregister. It uses Multiple Correspondence Analysis, a multivariate statistical technique, to reveal dimensions of keywords that co-occur across the texts of a corpus. These dimensions are then interpreted in terms of the discourses to which they contribute within the data, forming the basis of a corpus-assisted discourse analysis. The approach is demonstrated through an analysis of the discourses used to represent Muslims and Islam in a corpus of UK national newspaper articles on these topics published between 2010 and 2019. The approach reveals an interaction between discourse and subregister; the article therefore argues that (corpus-assisted) discourse analysts need to account for subregister as a level of meaningful variation when analysing press discourse.
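
To make the idea of "dimensions of keywords which co-occur across the texts" more concrete, the following is a minimal sketch of Multiple Correspondence Analysis computed as a correspondence analysis of an indicator matrix. It is not the authors' implementation. The function names (`indicator_matrix`, `mca`), the placeholder keyword labels and the toy presence/absence data are all assumptions made for illustration, and the sketch assumes each keyword has been coded simply as present or absent in each text.

```python
import numpy as np


def indicator_matrix(presence):
    """Expand a texts-by-keywords 0/1 matrix into an indicator matrix.

    Each keyword becomes two categories: the first K columns mean
    "keyword present", the last K columns mean "keyword absent".
    """
    presence = np.asarray(presence, dtype=float)
    return np.hstack([presence, 1.0 - presence])


def mca(presence, n_dims=2):
    """Sketch of MCA via SVD of the standardised residuals of the indicator matrix.

    Assumes every keyword is present in at least one text and absent
    from at least one text, so that no category has zero mass.
    """
    Z = indicator_matrix(presence)
    P = Z / Z.sum()                # correspondence matrix
    r = P.sum(axis=1)              # row (text) masses
    c = P.sum(axis=0)              # column (category) masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardised residuals
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    U, sigma, Vt = U[:, :n_dims], sigma[:n_dims], Vt[:n_dims]
    row_coords = (U * sigma) / np.sqrt(r)[:, None]      # texts
    col_coords = (Vt.T * sigma) / np.sqrt(c)[:, None]   # keyword categories
    return row_coords, col_coords, sigma ** 2           # principal inertias


# Hypothetical toy corpus: six texts, four placeholder keywords.
# Texts 0-2 contain kw_a and kw_b; texts 3-5 contain kw_c and kw_d.
keywords = ["kw_a", "kw_b", "kw_c", "kw_d"]
presence = np.array([[1, 1, 0, 0]] * 3 + [[0, 0, 1, 1]] * 3)

row_coords, col_coords, inertia = mca(presence)

# Coordinates of the "present" category of each keyword on dimension 1.
for name, coord in zip(keywords, col_coords[: len(keywords), 0]):
    print(f"{name}: {coord:+.3f}")
```

In this toy case, keywords that occur in the same texts receive coordinates of the same sign on the first dimension, while keywords that never co-occur fall on opposite sides. In an approach of the kind described above, it is groupings like these that an analyst would then interpret qualitatively in terms of discourses.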

Available under the CC BY 4.0 license.






  • Article Type: Research Article
  • Keyword(s): Islam; keyword analysis; Multiple Correspondence Analysis; newspaper discourse