License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.SEA.2022.4
URN: urn:nbn:de:0030-drops-165385
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2022/16538/
Go to the corresponding LIPIcs Volume Portal


Dillinger, Peter C. ; Hübschle-Schneider, Lorenz ; Sanders, Peter ; Walzer, Stefan

Fast Succinct Retrieval and Approximate Membership Using Ribbon

pdf-format:
LIPIcs-SEA-2022-4.pdf (1 MB)


Abstract

A retrieval data structure for a static function f: S → {0,1}^r supports queries that return f(x) for any x ∈ S. Retrieval data structures can be used to implement a static approximate membership query data structure (AMQ), i.e., a Bloom filter alternative, with false positive rate 2^{-r}. The information-theoretic lower bound for both tasks is r|S| bits. While succinct theoretical constructions using (1+o(1))r|S| bits were known, these could not achieve very small overheads in practice because they have an unfavorable space-time tradeoff hidden in the asymptotic costs or because small overheads would only be reached for physically impossible input sizes. With bumped ribbon retrieval (BuRR), we present the first practical succinct retrieval data structure. In an extensive experimental evaluation BuRR achieves space overheads well below 1% while being faster than most previously used retrieval data structures (typically with space overheads at least an order of magnitude larger) and faster than classical Bloom filters (with space overhead ≥ 44%). This efficiency, including favorable constants, stems from a combination of simplicity, word parallelism, and high locality.
We additionally describe homogeneous ribbon filter AMQs, which are even simpler and faster at the price of slightly larger space overhead.

BibTeX - Entry

@InProceedings{dillinger_et_al:LIPIcs.SEA.2022.4,
  author =	{Dillinger, Peter C. and H\"{u}bschle-Schneider, Lorenz and Sanders, Peter and Walzer, Stefan},
  title =	{{Fast Succinct Retrieval and Approximate Membership Using Ribbon}},
  booktitle =	{20th International Symposium on Experimental Algorithms (SEA 2022)},
  pages =	{4:1--4:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-251-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{233},
  editor =	{Schulz, Christian and U\c{c}ar, Bora},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2022/16538},
  URN =		{urn:nbn:de:0030-drops-165385},
  doi =		{10.4230/LIPIcs.SEA.2022.4},
  annote =	{Keywords: AMQ, Bloom filter, dictionary, linear algebra, randomized algorithm, retrieval data structure, static function data structure, succinct data structure, perfect hashing}
}

Keywords: AMQ, Bloom filter, dictionary, linear algebra, randomized algorithm, retrieval data structure, static function data structure, succinct data structure, perfect hashing
Collection: 20th International Symposium on Experimental Algorithms (SEA 2022)
Issue Date: 2022
Date of publication: 11.07.2022
Supplementary Material: The code and scripts used in our experiments are available under a permissive license at github:
Software (Source Code): https://github.com/lorenzhs/BuRR archived at: https://archive.softwareheritage.org/swh:1:dir:fa31381ae0e372bb33819932c0bcc1c51dcc0dfa
Software (Source Code): https://github.com/lorenzhs/fastfilter_cpp archived at: https://archive.softwareheritage.org/swh:1:dir:66c22ae4f3cc568d78c7b0d719988984a6688801


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI