Volume 8, Issue 1
  • ISSN: 2211-3711
  • E-ISSN: 2211-372X



Exploring questions of representativeness, balance and comparability is essential to tailoring corpus design and compilation to research goals, and to ensuring the validity of research results. This is especially true when the target population of texts under examination is very large and transcends a restricted area of specialization and/or covers multiple genres, as in the case of texts translated in institutional settings. This paper describes the multilayered sequential approach to corpus building applied in a comparative study on legal translation in three of these settings. The approach is based on a full mapping and categorization of institutional texts from a legal perspective; it applies an innovative combination of stratified sampling techniques integrating quantitative and qualitative criteria adapted to the research aims. The resulting corpora, categorization matrix and selection records, together with the methodological detail provided, can be useful for building other multi-genre corpora in translation studies and further afield.
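As a rough illustration of the stratified sampling idea mentioned above, the following minimal sketch draws a proportional random sample from each category of a text population. It is not the procedure used in the study: the function name `stratified_sample`, the `genre` field, the sampling fraction and the toy population are all hypothetical, and the qualitative selection criteria the abstract refers to are not modelled here.

```python
import math
import random
from collections import defaultdict


def stratified_sample(records, stratum_key, fraction, seed=0, min_per_stratum=1):
    """Draw a simple random sample of roughly `fraction` from each stratum.

    Hypothetical helper for illustration only; the allocation rule
    (proportional, rounded up, with a floor) is an assumption.
    """
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced

    # Group the records by the value of the stratification field.
    strata = defaultdict(list)
    for record in records:
        strata[record[stratum_key]].append(record)

    sample = []
    for name in sorted(strata):  # sorted for a deterministic order
        members = strata[name]
        # Proportional quota, rounded up, never below the floor
        # and never above the size of the stratum itself.
        quota = max(min_per_stratum, math.ceil(fraction * len(members)))
        quota = min(quota, len(members))
        sample.extend(rng.sample(members, quota))
    return sample


# Toy population: the genre labels and sizes are invented for the example.
population = (
    [{"id": f"leg-{i}", "genre": "legislation"} for i in range(50)]
    + [{"id": f"jud-{i}", "genre": "judgment"} for i in range(30)]
    + [{"id": f"rep-{i}", "genre": "report"} for i in range(5)]
)

selected = stratified_sample(population, "genre", fraction=0.1, seed=42)

# Count how many texts were selected per genre.
counts = {}
for record in selected:
    counts[record["genre"]] = counts.get(record["genre"], 0) + 1
print(counts)
```

The floor of one text per stratum keeps small categories from disappearing from the sample, which is one simple way to balance proportionality against coverage; a real design would set quotas according to the research aims.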

This work is licensed under a Creative Commons Attribution 4.0 license.




