License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICDT.2017.17
URN: urn:nbn:de:0030-drops-70489
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7048/


Kimelfeld, Benny ; Livshits, Ester ; Peterfreund, Liat

Detecting Ambiguity in Prioritized Database Repairing



Abstract

In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a "minimal way." Often, repairs are not equally legitimate, because some are preferable to others; for example, one fact may be regarded as more reliable than another, or a more recent fact may be preferred to an earlier one.

Motivated by these considerations, researchers have introduced and investigated the framework of preferred repairs, in the context of denial constraints and subset repairs. There, a priority relation between facts is lifted towards a priority relation between consistent databases, and repairs are restricted to the ones that are optimal in the lifted sense.

Three notions of lifting (and optimal repairs) have been proposed: Pareto, global, and completion.

In this paper we investigate the complexity of deciding whether the priority relation suffices to clean the database unambiguously, or in other words, whether there is exactly one optimal repair. We show that the different lifting semantics entail highly different complexities. Under Pareto optimality, the problem is coNP-complete, in data complexity, for every set of functional dependencies (FDs), except for the tractable case of (equivalence to) one FD per relation. Under global optimality, one FD per relation is still tractable, but we establish Π₂ᵖ-completeness for a relation with two FDs. In contrast, under completion optimality the problem is solvable in polynomial time for every set of FDs. In fact, we present a polynomial-time algorithm for arbitrary conflict hypergraphs. We further show that under a general assumption of transitivity, this algorithm solves the problem even for global optimality. The algorithm is extremely simple, but its proof of correctness is quite intricate.
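
To make the decision problem concrete, here is a minimal brute-force sketch of the ambiguity check under Pareto optimality. It is not the algorithm from the paper, which the abstract does not describe, and it runs in exponential time. It assumes the definition of Pareto improvement commonly used in the preferred-repairs literature (J improves I if some fact in J \ I is preferred to every fact in I \ J); the abstract itself does not state this definition. All function names, the conflict hypergraph encoding, and the toy instance are hypothetical.

```python
from itertools import combinations


def is_consistent(facts, conflicts):
    """A fact set is consistent if it fully contains no conflict hyperedge."""
    return not any(edge <= facts for edge in conflicts)


def consistent_subsets(database, conflicts):
    """Enumerate every consistent subset of the database (exponential)."""
    facts = sorted(database)
    for size in range(len(facts) + 1):
        for combo in combinations(facts, size):
            subset = frozenset(combo)
            if is_consistent(subset, conflicts):
                yield subset


def is_pareto_improvement(j, i, priority):
    """Assumed definition: some fact in j - i is preferred to every fact in i - j.

    `priority` is a set of pairs (f, g) meaning "f is preferred to g".
    """
    removed = i - j
    return any(all((f, g) in priority for g in removed) for f in j - i)


def pareto_optimal_repairs(database, conflicts, priority):
    """Consistent subsets that admit no consistent Pareto improvement."""
    candidates = list(consistent_subsets(database, conflicts))
    return [
        i for i in candidates
        if not any(is_pareto_improvement(j, i, priority) for j in candidates)
    ]


def is_unambiguous(database, conflicts, priority):
    """True if the priority relation leaves exactly one optimal repair."""
    return len(pareto_optimal_repairs(database, conflicts, priority)) == 1


# Toy instance (hypothetical): fact "b" conflicts with both "a" and "c".
database = {"a", "b", "c"}
conflicts = [frozenset({"a", "b"}), frozenset({"b", "c"})]

# With the priority a > b, the repair {b} is improved by {a, c}.
print(is_unambiguous(database, conflicts, {("a", "b")}))
# With no priorities, both {b} and {a, c} remain optimal.
print(is_unambiguous(database, conflicts, set()))
```

Because a proper superset counts as an improvement under the assumed definition (the "for every fact in I \ J" condition holds vacuously), non-maximal consistent subsets are filtered out automatically, so the surviving sets are subset repairs.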

BibTeX - Entry

@InProceedings{kimelfeld_et_al:LIPIcs:2017:7048,
  author =	{Benny Kimelfeld and Ester Livshits and Liat Peterfreund},
  title =	{{Detecting Ambiguity in Prioritized Database Repairing}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{17:1--17:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Michael Benedikt and Giorgio Orsi},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2017/7048},
  URN =		{urn:nbn:de:0030-drops-70489},
  doi =		{10.4230/LIPIcs.ICDT.2017.17},
  annote =	{Keywords: inconsistent databases, preferred repairs, data cleaning, functional dependencies, conflict hypergraph}
}

Keywords: inconsistent databases, preferred repairs, data cleaning, functional dependencies, conflict hypergraph
Collection: 20th International Conference on Database Theory (ICDT 2017)
Issue Date: 2017
Date of publication: 17.03.2017

