The usefulness of corpus-based descriptions of English for learners: The case of relative frequency
The paper argues that some of the most interesting information about word frequency is difficult to present in a way that is useful to learners. This is particularly true when the issue is not absolute frequency but relative frequency that depends on complex variables. Three instances are discussed. The first concerns phrases which are not frequent in themselves but which can be counted as phrases because of the significance of the unit to each of its components. In such cases, the unity of a perceived multi-word unit depends on the strength of collocation between progressively longer sequences. The second concerns the factors influencing the relative frequency of wordforms within a lemma, which include complementation pattern and modality. It is hypothesised that a relatively high frequency of the base wordform is likely to co-occur with a high frequency of modal-like expressions. The third concerns ‘semantic sequences’, which describe what is often said, though not what should be said. These emerge from a combination of discoursal constraints, including the phraseology of a discipline. The paper raises questions concerning the relevance of such frequency information to language teaching and the difficulty of translating an approach to describing a corpus into a useful tool for language learners and teachers.