License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.LDK.2021.11
URN: urn:nbn:de:0030-drops-145473
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/14547/


Moreno-Schneider, Julián ; Plakidis, Melina ; Rehm, Georg

Annotation of Fine-Grained Geographical Entities in German Texts



Abstract

We are creating a corpus of web-crawled texts about the Berlin district of Moabit, intended primarily for training NER systems in German and English. Typical NER corpora and the systems trained on them distinguish persons, organisations and locations, but do not differentiate between types of location entities. Our tourism-inspired use case requires fine-grained annotations for toponyms. In this paper, we outline the fine-grained classification of geographical entities and the resulting annotations, and we present preliminary results on automatically tagging toponyms in a small, bootstrapped gold corpus.

BibTeX - Entry

@InProceedings{morenoschneider_et_al:OASIcs.LDK.2021.11,
  author =	{Moreno-Schneider, Juli\'{a}n and Plakidis, Melina and Rehm, Georg},
  title =	{{Annotation of Fine-Grained Geographical Entities in German Texts}},
  booktitle =	{3rd Conference on Language, Data and Knowledge (LDK 2021)},
  pages =	{11:1--11:8},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-199-3},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{93},
  editor =	{Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/14547},
  URN =		{urn:nbn:de:0030-drops-145473},
  doi =		{10.4230/OASIcs.LDK.2021.11},
  annote =	{Keywords: Named Entity Recognition, Geographical Entities, Annotation}
}

Keywords: Named Entity Recognition, Geographical Entities, Annotation
Collection: 3rd Conference on Language, Data and Knowledge (LDK 2021)
Issue Date: 2021
Date of publication: 30.08.2021
Supplementary Material: Collection (Collection of documents about Moabit district annotated with Geographical Entities): https://gitlab.com/jmschnei/Moabit-Collection

