License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICALP.2020.74
URN: urn:nbn:de:0030-drops-124813
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12481/
Go to the corresponding LIPIcs Volume Portal


Kopelowitz, Tsvi ; Vassilevska Williams, Virginia

Towards Optimal Set-Disjointness and Set-Intersection Data Structures

pdf-format:
LIPIcs-ICALP-2020-74.pdf (0.6 MB)


Abstract

In the online set-disjointness problem the goal is to preprocess a family of sets ℱ, so that given two sets S,S' ∈ ℱ, one can quickly establish whether the two sets are disjoint or not. If N = ∑_{S ∈ ℱ} |S|, then let N^p be the preprocessing time and let N^q be the query time. The most efficient known combinatorial algorithm is a generalization of an algorithm by Cohen and Porat [TCS'10] which has a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, based on the 3SUM hypothesis, there is a conditional lower bound curve of p+2q ≥ 2. Thus, the current state-of-the-art exhibits a large gap.
The online set-intersection problem is the reporting version of the online set-disjointness problem, and given a query, the goal is to report all of the elements in the intersection. When considering algorithms with N^p preprocessing time and N^q +O(op) query time, where op is the size of the output, the combinatorial algorithm for online set-disjointess can be extended to solve online set-intersection with a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, assuming the 3SUM hypothesis, for 0 ≤ q ≤ 2/3 this curve is tight. However, for 2/3 ≤ q < 1 there is no known lower bound.
In this paper we close both gaps by showing the following:
- For online set-disjointness we design an algorithm whose runtime, assuming ω = 2 (where ω is the exponent in the fastest matrix multiplication algorithm), matches the lower bound curve of Kopelowitz et al., for q ≤ 1/3. We then complement the new algorithm by a matching conditional lower bound for q > 1/3 which is based on a natural hypothesis on the time required to detect a triangle in an unbalanced tripartite graph. Remarkably, even if ω > 2, the algorithm matches the lower bound curve of Kopelowitz et al. for p≥ 1.73688 and q ≤ 0.13156.
- For set-intersection, we prove a conditional lower bound that matches the combinatorial upper bound curve for q≥ 1/2 which is based on a hypothesis on the time required to enumerate all triangles in an unbalanced tripartite graph.
- Finally, we design algorithms for detecting and enumerating triangles in unbalanced tripartite graphs which match the lower bounds of the corresponding hypotheses, assuming ω = 2.

BibTeX - Entry

@InProceedings{kopelowitz_et_al:LIPIcs:2020:12481,
  author =	{Tsvi Kopelowitz and Virginia Vassilevska Williams},
  title =	{{Towards Optimal Set-Disjointness and Set-Intersection Data Structures}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{74:1--74:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Artur Czumaj and Anuj Dawar and Emanuela Merelli},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12481},
  URN =		{urn:nbn:de:0030-drops-124813},
  doi =		{10.4230/LIPIcs.ICALP.2020.74},
  annote =	{Keywords: Set-disjointness data structures, Triangle detection, Triangle enumeration, Fine-grained complexity, Fast matrix multiplication}
}

Keywords: Set-disjointness data structures, Triangle detection, Triangle enumeration, Fine-grained complexity, Fast matrix multiplication
Collection: 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)
Issue Date: 2020
Date of publication: 29.06.2020


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI