
Full text loading...
Abstract
In this article, we present the results of a corpus-based study where we explore whether it is possible to automatically single out different facets of text complexity in a general-purpose corpus. To this end, we use factor analysis as applied in Biber’s multi-dimensional analysis framework. We evaluate the results of the factor solution by correlating factor scores and readability scores to ascertain whether the selected factor solution matches the independent measurement of readability, which is a notion tightly linked to text complexity. The corpus used in the study is the Swedish national corpus, called Stockholm-Umeå Corpus or SUC. The SUC contains subject-based text varieties (e.g., hobby), press genres (e.g., editorials), and mixed categories (e.g., miscellaneous). We refer to them collectively as ‘registers’. Results show that it is indeed possible to elicit and interpret facets of text complexity using factor analysis despite some caveats. We propose a tentative text complexity profiling of the SUC registers.
Article metrics loading...
Full text loading...
References
Data & Media loading...