License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICDT.2016.18
URN: urn:nbn:de:0030-drops-57877
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2016/5787/
Go to the corresponding LIPIcs Volume Portal


Assadi, Sepehr ; Khanna, Sanjeev ; Li, Yang ; Tannen, Val

Algorithms for Provisioning Queries and Analytics

pdf-format:
17.pdf (0.6 MB)


Abstract

Provisioning is a technique for avoiding repeated expensive computations in what-if analysis. Given a query, an analyst formulates k hypotheticals, each retaining some of the tuples of a database instance, possibly overlapping, and she wishes to answer the query under scenarios, where a scenario is defined by a subset of the hypotheticals that are "turned on". We say that a query admits compact provisioning if given any database instance and any k hypotheticals, one can create a poly-size (in k) sketch that can then be used to answer the query under any of the 2^k possible scenarios without accessing the original instance.

In this paper, we focus on provisioning complex queries that combine relational algebra (the logical component), grouping, and statistics/analytics (the numerical component). We first show that queries that compute quantiles or linear regression (as well as simpler queries that compute count and sum/average of positive values) can be compactly provisioned to provide (multiplicative) approximate answers to an arbitrary precision. In contrast, exact provisioning for each of these statistics requires the sketch size to be exponential in k. We then establish that for any complex query whose logical component is a positive relational algebra query, as long as the numerical component can be compactly provisioned, the complex query itself can be compactly provisioned. On the other hand, introducing negation or recursion in the logical component again requires the sketch size to be exponential in k. While our positive results use algorithms that do not access the original instance after a scenario is known, we prove our lower bounds even for the case when, knowing the scenario, limited access to the instance is allowed.

BibTeX - Entry

@InProceedings{assadi_et_al:LIPIcs:2016:5787,
  author =	{Sepehr Assadi and Sanjeev Khanna and Yang Li and Val Tannen},
  title =	{{Algorithms for Provisioning Queries and Analytics}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Wim Martens and Thomas Zeume},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2016/5787},
  URN =		{urn:nbn:de:0030-drops-57877},
  doi =		{10.4230/LIPIcs.ICDT.2016.18},
  annote =	{Keywords: What-if Analysis, Provisioning, Data Compression, Approximate Query Answering}
}

Keywords: What-if Analysis, Provisioning, Data Compression, Approximate Query Answering
Collection: 19th International Conference on Database Theory (ICDT 2016)
Issue Date: 2016
Date of publication: 14.03.2016


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI