License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.APPROX/RANDOM.2023.35
URN: urn:nbn:de:0030-drops-188607
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18860/


Karayel, Emin

An Embarrassingly Parallel Optimal-Space Cardinality Estimation Algorithm


Abstract

In 2020, Błasiok (ACM Trans. Algorithms 16(2) 3:1-3:28) constructed an optimal-space streaming algorithm for the cardinality estimation problem, with space complexity O(ε^{-2} ln(δ^{-1}) + ln n), where ε, δ and n denote the relative accuracy, failure probability and universe size, respectively. However, his solution requires the stream to be processed sequentially. On the other hand, there are algorithms that admit a merge operation; they can be used in a distributed setting, allowing parallel processing of sections of the stream, and are highly relevant for large-scale distributed applications. The best-known such algorithm, unfortunately, has a space complexity exceeding Ω(ln(δ^{-1}) (ε^{-2} ln ln n + ln n)). This work presents a new algorithm that improves on the solution by Błasiok, preserving its space complexity, but with the benefit that it admits such a merge operation, thus providing an optimal solution for the problem in both sequential and parallel applications. Orthogonally, the new algorithm also improves algorithmically on Błasiok’s solution (even in the sequential setting) by reducing its implementation complexity and requiring fewer distinct pseudo-random objects.
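
To illustrate what "admits a merge operation" means, the following is a minimal Python sketch of a classic mergeable cardinality estimator, the k-minimum-values (KMV) sketch. It is not the algorithm from this paper and does not achieve its space bound; it is swapped in only because it is short. The class name `KMVSketch`, the parameter `k`, and the use of SHA-256 as the hash function are assumptions made for this illustration.

```python
import hashlib
import heapq


class KMVSketch:
    """Illustrative k-minimum-values sketch (NOT the paper's algorithm).

    It stores the k smallest distinct hash values seen so far. Two sketches
    built with the same k and hash function can be merged without access
    to the original streams.
    """

    def __init__(self, k=256):
        self.k = k
        self.values = set()  # at most k hash values, each in [0, 1]

    @staticmethod
    def _hash(item):
        # Assumption: SHA-256 stands in for a random hash function.
        digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def add(self, item):
        self.values.add(self._hash(item))
        if len(self.values) > self.k:
            # Drop the largest value so only the k smallest remain.
            self.values.remove(max(self.values))

    def merge(self, other):
        # The k smallest values of the union are exactly what a single
        # sketch over the concatenated streams would have stored.
        merged = KMVSketch(self.k)
        merged.values = set(heapq.nsmallest(self.k, self.values | other.values))
        return merged

    def estimate(self):
        if len(self.values) < self.k:
            # Fewer than k distinct hashes seen: the count is exact
            # (barring hash collisions).
            return float(len(self.values))
        # Standard KMV estimator based on the k-th smallest hash value.
        return (self.k - 1) / max(self.values)
```

In this sketch, processing two sections of a stream on separate machines and calling `left.merge(right)` produces the same state as processing the whole stream on one machine, which is the property that makes an estimator usable in the distributed setting described above.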

BibTeX - Entry

@InProceedings{karayel:LIPIcs.APPROX/RANDOM.2023.35,
  author =	{Karayel, Emin},
  title =	{{An Embarrassingly Parallel Optimal-Space Cardinality Estimation Algorithm}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)},
  pages =	{35:1--35:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-296-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{275},
  editor =	{Megow, Nicole and Smith, Adam},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/18860},
  URN =		{urn:nbn:de:0030-drops-188607},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2023.35},
  annote =	{Keywords: Distinct Elements, Distributed Algorithms, Randomized Algorithms, Expander Graphs, Derandomization, Sketching}
}

Keywords: Distinct Elements, Distributed Algorithms, Randomized Algorithms, Expander Graphs, Derandomization, Sketching
Collection: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)
Issue Date: 2023
Date of publication: 04.09.2023
Supplementary Material: Software: https://isa-afp.org/entries/Distributed_Distinct_Elements.html
Software: https://isa-afp.org/entries/Expander_Graphs.html

