License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICALP.2021.11
URN: urn:nbn:de:0030-drops-140803
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/14080/
Agarwal, Pankaj K. ;
Hu, Xiao ;
Sintos, Stavros ;
Yang, Jun
Dynamic Enumeration of Similarity Joins
Abstract
This paper considers enumerating answers to similarity-join queries under dynamic updates: Given two sets of n points A, B in ℝ^d, a metric ϕ(⋅), and a distance threshold r > 0, report all pairs of points (a, b) ∈ A × B with ϕ(a,b) ≤ r. Our goal is to store A and B in a dynamic data structure that, whenever asked, can enumerate all result pairs with a worst-case delay guarantee, i.e., the time between enumerating two consecutive pairs is bounded. Furthermore, the data structure can be efficiently updated when a point is inserted into or deleted from A or B.
We propose several efficient data structures for answering similarity-join queries in low dimension. For exact enumeration of similarity join, we present near-linear-size data structures for ℓ₁, ℓ_∞ metrics with log^{O(1)} n update time and delay. We show that such a data structure is not feasible for the ℓ₂ metric for d ≥ 4. For approximate enumeration of similarity join, where the distance threshold is a soft constraint, we obtain a unified linear-size data structure for ℓ_p metric, with log^{O(1)} n delay and update time. In high dimensions, we present an efficient data structure with worst-case delay-guarantee using locality sensitive hashing (LSH).
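To make the problem statement concrete, the following is a minimal brute-force sketch of similarity-join enumeration. It is not one of the data structures from the paper, only a baseline that illustrates the query being answered. The names `lp_distance` and `similarity_join` are hypothetical and chosen for this illustration.

```python
from itertools import product
from typing import Iterator, Sequence, Tuple

Point = Tuple[float, ...]


def lp_distance(a: Point, b: Point, p: float) -> float:
    """Return the l_p distance between a and b; p = float('inf') gives l_inf."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def similarity_join(
    A: Sequence[Point], B: Sequence[Point], r: float, p: float = 2.0
) -> Iterator[Tuple[Point, Point]]:
    """Lazily enumerate all pairs (a, b) in A x B with l_p distance at most r.

    Brute-force baseline (an assumption of this sketch): it scans every
    pair, so the time between two consecutive reported pairs is not
    bounded by anything better than the full scan.
    """
    for a, b in product(A, B):
        if lp_distance(a, b, p) <= r:
            yield (a, b)


# Small illustrative input (made-up points, not data from the paper).
A = [(0.0, 0.0), (5.0, 5.0)]
B = [(1.0, 1.0), (9.0, 9.0)]

pairs_l1 = list(similarity_join(A, B, r=2.0, p=1.0))
pairs_linf = list(similarity_join(A, B, r=4.0, p=float("inf")))
```

Because the generator may inspect many non-qualifying pairs between two outputs, and recomputes everything after an insertion or deletion, it offers neither the delay guarantee nor the efficient updates that the abstract describes as the goal.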
BibTeX - Entry
@InProceedings{agarwal_et_al:LIPIcs.ICALP.2021.11,
author = {Agarwal, Pankaj K. and Hu, Xiao and Sintos, Stavros and Yang, Jun},
title = {{Dynamic Enumeration of Similarity Joins}},
booktitle = {48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
pages = {11:1--11:19},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-195-5},
ISSN = {1868-8969},
year = {2021},
volume = {198},
editor = {Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2021/14080},
URN = {urn:nbn:de:0030-drops-140803},
doi = {10.4230/LIPIcs.ICALP.2021.11},
annote = {Keywords: dynamic enumeration, similarity joins, worst-case delay guarantee}
}
Keywords: dynamic enumeration, similarity joins, worst-case delay guarantee
Collection: 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)
Issue Date: 2021
Date of publication: 02.07.2021