Abstract
We study the query version of the approximate heavy hitter and quantile problems. In the former problem, the input is a parameter ε and a set P of n points in ℝ^d where each point is assigned a color from a set C, and the goal is to build a structure such that given any geometric range γ, we can efficiently find a list of approximate heavy hitters in γ∩P, i.e., colors that appear at least ε γ∩P times in γ∩P, as well as their frequencies with an additive error of ε γ∩P. In the latter problem, each point is assigned a weight from a totally ordered universe and the query must output a sequence S of 1+1/ε weights such that the ith weight in S has approximate rank iεγ∩P, meaning, rank iεγ∩P up to an additive error of εγ∩P. Previously, optimal results were only known in 1D [Wei and Yi, 2011] but a few suboptimal methods were available in higher dimensions [Peyman Afshani and Zhewei Wei, 2017; Pankaj K. Agarwal et al., 2012].
We study the problems for two important classes of geometric ranges: 3D halfspace and 3D dominance queries. It is known that many other important queries can be reduced to these two, e.g., 1D interval stabbing or interval containment, 2D threesided queries, 2D circular as well as 2D knearest neighbors queries. We consider the real RAM model of computation where integer registers of size w bits, w = Θ(log n), are also available. For dominance queries, we show optimal solutions for both heavy hitter and quantile problems: using linear space, we can answer both queries in time O(log n + 1/ε). Note that as the output size is 1/ε, after investing the initial O(log n) searching time, our structure takes on average O(1) time to find a heavy hitter or a quantile! For more general halfspace heavy hitter queries, the same optimal query time can be achieved by increasing the space by an extra log_w(1/ε) (resp. log log_w(1/ε)) factor in 3D (resp. 2D). By spending extra log^O(1)(1/ε) factors in both time and space, we can also support quantile queries.
We remark that it is hopeless to achieve a similar query bound for dimensions 4 or higher unless significant advances are made in the data structure side of theory of geometric approximations.
BibTeX  Entry
@InProceedings{afshani_et_al:LIPIcs.ICALP.2023.7,
author = {Afshani, Peyman and Cheng, Pingan and Basu Roy, Aniket and Wei, Zhewei},
title = {{On Range Summary Queries}},
booktitle = {50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
pages = {7:17:17},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959772785},
ISSN = {18688969},
year = {2023},
volume = {261},
editor = {Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/18059},
URN = {urn:nbn:de:0030drops180590},
doi = {10.4230/LIPIcs.ICALP.2023.7},
annote = {Keywords: Computational Geometry, Range Searching, Random Sampling, Geometric Approximation, Data Structures and Algorithms}
}
Keywords: 

Computational Geometry, Range Searching, Random Sampling, Geometric Approximation, Data Structures and Algorithms 
Collection: 

50th International Colloquium on Automata, Languages, and Programming (ICALP 2023) 
Issue Date: 

2023 
Date of publication: 

05.07.2023 