Volume 28, Issue 3
  • ISSN 1384-6655
  • E-ISSN: 1569-9811



This paper elaborates on the notion of uncertainty in the context of annotation in large text corpora, specifically focusing on (but not limited to) historical languages. Such uncertainty might be due to inherent properties of the language, for example, linguistic ambiguity and overlapping categories of linguistic description, but could also be caused by a lack of annotation expertise. By examining annotation uncertainty in more detail, we identify its sources, deepen our understanding of the nature and different types of uncertainty encountered in daily annotation practice, and discuss practical implications of our theoretical findings. This paper can be seen as an attempt to reconcile the perspectives of the two main scientific disciplines involved in corpus projects (linguistics and computer science), to develop a unified view, and to highlight the potential synergies between these disciplines.






  • Article Type: Research Article
Keyword(s): annotation; fuzziness; grammatical change; uncertainty