License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.LDK.2019.14
URN: urn:nbn:de:0030-drops-103787

Weichselbraun, Albert ; Kuntschik, Philipp ; Brasoveanu, Adrian M. P.

Name Variants for Improving Entity Discovery and Linking


Identifying all names that refer to a particular set of named entities is a challenging task, because it often requires handling many sources of variation, such as abbreviations, aliases, hypocorisms, multilingualism or partial matches. Each entity type can also have specific rules for name variants: person names can include titles, country and branch names are sometimes removed from organization names, while locations are often plagued by the issue of nested entities. The lack of a clear strategy for collecting, processing and computing name variants significantly lowers the recall of tasks such as Named Entity Linking and Knowledge Base Population, since name variants are frequently used in all kinds of textual content.
This paper proposes several strategies to address these issues. Recall can be improved by combining knowledge repositories and by computing additional variants with algorithmic approaches. Heuristics and machine learning methods then analyze the generated name variants and mark ambiguous names to increase precision. An extensive evaluation demonstrates the effects of integrating these methods into a new Named Entity Linking framework and confirms that systematically considering name variants yields significant performance improvements.
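
The two steps described above, generating additional name variants and then marking ambiguous ones, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word lists, the function names (person_variants, organization_variants, mark_ambiguous) and the rules themselves are hypothetical, and a plain collision check stands in for the heuristics and machine learning methods that the paper applies.

```python
# Illustrative sketch only. All lists and rules below are assumptions
# made for this example, not resources taken from the paper.
PERSON_TITLES = {"dr", "prof", "mr", "mrs", "ms", "sir"}
ORG_SUFFIXES = {"inc", "ltd", "ag", "gmbh", "corp", "plc"}
COUNTRY_NAMES = {"germany", "switzerland", "austria", "usa"}


def _norm(token):
    """Lower-case a token and drop trailing punctuation for list lookups."""
    return token.strip(".,").lower()


def person_variants(name):
    """Return simple variants of a person name: without titles,
    surname only, and first initial plus surname."""
    tokens = [t for t in name.split() if _norm(t) not in PERSON_TITLES]
    variants = {name, " ".join(tokens)}
    if len(tokens) >= 2:
        variants.add(tokens[-1])                       # surname only
        variants.add(f"{tokens[0][0]}. {tokens[-1]}")  # initial + surname
    variants.discard("")
    return variants


def organization_variants(name):
    """Return simple variants of an organization name: without legal-form
    suffixes and country names, plus an acronym of the remaining tokens."""
    dropped = ORG_SUFFIXES | COUNTRY_NAMES
    core = [t for t in name.split() if _norm(t) not in dropped]
    variants = {name, " ".join(core)}
    if len(core) >= 2:
        variants.add("".join(t[0].upper() for t in core))  # acronym
    variants.discard("")
    return variants


def mark_ambiguous(variants_by_entity):
    """Return every variant that is shared by more than one entity."""
    owners = {}
    for entity, variants in variants_by_entity.items():
        for variant in variants:
            owners.setdefault(variant, set()).add(entity)
    return {v for v, entities in owners.items() if len(entities) > 1}


if __name__ == "__main__":
    candidates = {
        "person:1": person_variants("Dr. John Smith"),
        "person:2": person_variants("Jane Smith"),
    }
    print(sorted(mark_ambiguous(candidates)))
```

In this sketch the generated variants raise recall, while the variants returned by mark_ambiguous are the ones a linker would treat with caution, or exclude, to protect precision.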

BibTeX - Entry

  author =	{Albert Weichselbraun and Philipp Kuntschik and Adrian M. P. Brasoveanu},
  title =	{{Name Variants for Improving Entity Discovery and Linking}},
  booktitle =	{2nd Conference on Language, Data and Knowledge (LDK 2019)},
  pages =	{14:1--14:15},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-105-4},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{70},
  editor =	{Maria Eskevich and Gerard de Melo and Christian F{\"a}th and John P. McCrae and Paul Buitelaar and Christian Chiarcos and Bettina Klimek and Milan Dojchinovski},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-103787},
  doi =		{10.4230/OASIcs.LDK.2019.14},
  annote =	{Keywords: Named Entity Linking, Name Variance, Machine Learning, Linked Data}

Keywords: Named Entity Linking, Name Variance, Machine Learning, Linked Data
Collection: 2nd Conference on Language, Data and Knowledge (LDK 2019)
Issue Date: 2019
Date of publication: 16.05.2019
