License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.SEA.2023.19
URN: urn:nbn:de:0030-drops-183693
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18369/


He, Meng ; Liu, Zhen

Exact and Approximate Range Mode Query Data Structures in Practice



Abstract

We conduct an experimental study on the range mode problem. In the exact version of the problem, we preprocess an array A, such that given a query range [a, b], the most frequent element in A[a, b] can be found efficiently. For this problem, our most important finding is that the strategy of using succinct data structures to encode more precomputed information not only helped Chan et al. (Linear-space data structures for range mode query in arrays, Theory of Computing Systems, 2013) improve previous results in theory but also helps us achieve the best time/space tradeoff in practice; we even go a step further to replace more components in their solution with succinct data structures and improve the performance further.
In the approximate version of this problem, a (1+ε)-approximate range mode query looks for an element whose number of occurrences in A[a, b] is at least F_{a,b}/(1+ε), where F_{a,b} is the frequency of the mode in A[a, b]. We implement all previous solutions to this problem and find that, even when ε = 1/2, the average approximation ratio of these solutions is close to 1 in practice, and they provide much faster query time than the best exact solution. These solutions achieve different useful time-space tradeoffs, and among them, El-Zein et al. (On Approximate Range Mode and Range Selection, 30th International Symposium on Algorithms and Computation, 2019) provide us with one solution whose space usage is only 35.6% to 93.8% of the cost of storing the input array of 32-bit integers (in most cases, the space cost is closer to the lower end, and the average space cost is 20.2 bits per symbol among all datasets). Its non-succinct version also stands out with query support at least several times faster than other O(n/ε)-word structures while using only slightly more space in practice.
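
To make the two problem definitions above concrete, here is a minimal brute-force sketch in Python. It is not one of the data structures studied in the paper: it does no preprocessing and simply counts occurrences at query time. The function names `range_mode` and `is_approx_mode` are hypothetical, and the sketch assumes 0-based indices with an inclusive, non-empty range [a, b].

```python
from collections import Counter


def range_mode(A, a, b):
    """Return (mode, frequency) for A[a..b], inclusive, by direct counting.

    Assumes 0 <= a <= b < len(A). Ties between equally frequent
    elements are broken arbitrarily.
    """
    counts = Counter(A[a:b + 1])
    mode, freq = counts.most_common(1)[0]
    return mode, freq


def is_approx_mode(A, a, b, x, eps):
    """Check whether x is a (1 + eps)-approximate mode of A[a..b].

    x qualifies if its number of occurrences in A[a..b] is at least
    F_{a,b} / (1 + eps), where F_{a,b} is the frequency of the mode.
    """
    _, best = range_mode(A, a, b)
    occurrences = A[a:b + 1].count(x)
    # Multiply instead of divide to keep the comparison simple.
    return occurrences * (1 + eps) >= best


if __name__ == "__main__":
    A = [1, 2, 2, 3, 3, 3, 3, 1]
    print(range_mode(A, 0, 7))
    print(is_approx_mode(A, 0, 7, 2, 0.5))
```

Each query in this baseline takes time linear in the length of the range, which is the cost that preprocessing-based structures aim to avoid.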

BibTeX - Entry

@InProceedings{he_et_al:LIPIcs.SEA.2023.19,
  author =	{He, Meng and Liu, Zhen},
  title =	{{Exact and Approximate Range Mode Query Data Structures in Practice}},
  booktitle =	{21st International Symposium on Experimental Algorithms (SEA 2023)},
  pages =	{19:1--19:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-279-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{265},
  editor =	{Georgiadis, Loukas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/18369},
  URN =		{urn:nbn:de:0030-drops-183693},
  doi =		{10.4230/LIPIcs.SEA.2023.19},
  annote =	{Keywords: range mode query, exact range mode query, approximate range mode query}
}

Keywords: range mode query, exact range mode query, approximate range mode query
Collection: 21st International Symposium on Experimental Algorithms (SEA 2023)
Issue Date: 2023
Date of publication: 19.07.2023
Supplementary Material: Software (Source Code): https://github.com/Kolento777/RangeModeQueries archived at: https://archive.softwareheritage.org/swh:1:dir:5d61144576ed7d45a2e424ae08b6b010c1a6e90c

