License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ISAAC.2021.64
URN: urn:nbn:de:0030-drops-154974
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/15497/
Ferragina, Paolo ;
Manzini, Giovanni ;
Vinciguerra, Giorgio
Repetition- and Linearity-Aware Rank/Select Dictionaries
Abstract
We revisit the fundamental problem of compressing an integer dictionary that supports efficient rank and select operations by exploiting two kinds of regularities arising in real data: repetitiveness and approximate linearity. Our first contribution is a Lempel-Ziv parsing properly enriched to also capture approximate linearity in the data and still be compressed to the kth order entropy. Our second contribution is a variant of the block tree structure whose space complexity takes advantage of both repetitiveness and approximate linearity, and results highly competitive in time too. Our third and final contribution is an implementation and experimentation of this last data structure, which achieves new space-time trade-offs compared to known data structures that exploit only one of the two regularities.
BibTeX - Entry
@InProceedings{ferragina_et_al:LIPIcs.ISAAC.2021.64,
author = {Ferragina, Paolo and Manzini, Giovanni and Vinciguerra, Giorgio},
title = {{Repetition- and Linearity-Aware Rank/Select Dictionaries}},
booktitle = {32nd International Symposium on Algorithms and Computation (ISAAC 2021)},
pages = {64:1--64:16},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-214-3},
ISSN = {1868-8969},
year = {2021},
volume = {212},
editor = {Ahn, Hee-Kap and Sadakane, Kunihiko},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2021/15497},
URN = {urn:nbn:de:0030-drops-154974},
doi = {10.4230/LIPIcs.ISAAC.2021.64},
annote = {Keywords: Data compression, Compressed data structures, Entropy}
}