An automatic part-of-speech tagger for Middle Low German
