
This paper draws attention to the complexity of the problems that arise in statistical linguistics when various corpora must be compared. These problems are discussed from the point of view of distributional statistical analysis of texts, that is, a set of formal procedures that assume a minimum of preconceived linguistic knowledge. A terminological distinction between contrastive and comparable corpora is introduced.