License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ISAAC.2017.42
URN: urn:nbn:de:0030-drops-82316
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/8231/
Go to the corresponding LIPIcs Volume Portal


Gudmundsson, Joachim ; Pagh, Rasmus

Range-Efficient Consistent Sampling and Locality-Sensitive Hashing for Polygons

pdf-format:
LIPIcs-ISAAC-2017-42.pdf (0.5 MB)


Abstract

Locality-sensitive hashing (LSH) is a fundamental technique for similarity search and similarity estimation in high-dimensional spaces.
The basic idea is that similar objects should produce hash collisions with probability significantly larger than objects with low similarity.
We consider LSH for objects that can be represented as point sets in either one or two dimensions.
To make the point sets finite size we consider the subset of points on a grid.
Directly applying LSH (e.g. min-wise hashing) to these point sets would require time proportional to the number of points.
We seek to achieve time that is much lower than direct approaches.

Technically, we introduce new primitives for range-efficient consistent sampling (of independent interest), and show how to turn such samples into LSH values.
Another application of our technique is a data structure for quickly estimating the size of the intersection or union of a set of preprocessed polygons.
Curiously, our consistent sampling method uses transformation to a geometric problem.

BibTeX - Entry

@InProceedings{gudmundsson_et_al:LIPIcs:2017:8231,
  author =	{Joachim Gudmundsson and Rasmus Pagh},
  title =	{{Range-Efficient Consistent Sampling and Locality-Sensitive Hashing for Polygons}},
  booktitle =	{28th International Symposium on Algorithms and Computation (ISAAC 2017)},
  pages =	{42:1--42:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-054-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{92},
  editor =	{Yoshio Okamoto and Takeshi Tokuyama},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2017/8231},
  URN =		{urn:nbn:de:0030-drops-82316},
  doi =		{10.4230/LIPIcs.ISAAC.2017.42},
  annote =	{Keywords: Locality-sensitive hashing, probability distribution, polygon, min-wise hashing, consistent sampling}
}

Keywords: Locality-sensitive hashing, probability distribution, polygon, min-wise hashing, consistent sampling
Collection: 28th International Symposium on Algorithms and Computation (ISAAC 2017)
Issue Date: 2017
Date of publication: 07.12.2017


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI