License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICALP.2020.76
URN: urn:nbn:de:0030-drops-124834
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12483/
Go to the corresponding LIPIcs Volume Portal


Laarhoven, Thijs

Polytopes, Lattices, and Spherical Codes for the Nearest Neighbor Problem

pdf-format:
LIPIcs-ICALP-2020-76.pdf (0.9 MB)


Abstract

We study locality-sensitive hash methods for the nearest neighbor problem for the angular distance, focusing on the approach of first projecting down onto a random low-dimensional subspace, and then partitioning the projected vectors according to the Voronoi cells induced by a well-chosen spherical code. This approach generalizes and interpolates between the fast but asymptotically suboptimal hyperplane hashing of Charikar [STOC 2002], and asymptotically optimal but practically often slower hash families of e.g. Andoni - Indyk [FOCS 2006], Andoni - Indyk - Nguyen - Razenshteyn [SODA 2014] and Andoni - Indyk - Laarhoven - Razenshteyn - Schmidt [NIPS 2015]. We set up a framework for analyzing the performance of any spherical code in this context, and we provide results for various codes appearing in the literature, such as those related to regular polytopes and root lattices. Similar to hyperplane hashing, and unlike e.g. cross-polytope hashing, our analysis of collision probabilities and query exponents is exact and does not hide any order terms which vanish only for large d, thus facilitating an easier parameter selection in practical applications.
For the two-dimensional case, we analytically derive closed-form expressions for arbitrary spherical codes, and we show that the equilateral triangle is optimal, achieving a better performance than the two-dimensional analogues of hyperplane and cross-polytope hashing. In three and four dimensions, we numerically find that the tetrahedron and 5-cell (the 3-simplex and 4-simplex) and the 16-cell (the 4-orthoplex) achieve the best query exponents, while in five or more dimensions orthoplices appear to outperform regular simplices, as well as the root lattice families A_k and D_k in terms of minimizing the query exponent. We provide lower bounds based on spherical caps, and we predict that in higher dimensions, larger spherical codes exist which outperform orthoplices in terms of the query exponent, and we argue why using the D_k root lattices will likely lead to better results in practice as well (compared to using cross-polytopes), due to a better trade-off between the asymptotic query exponent and the concrete costs of hashing.

BibTeX - Entry

@InProceedings{laarhoven:LIPIcs:2020:12483,
  author =	{Thijs Laarhoven},
  title =	{{Polytopes, Lattices, and Spherical Codes for the Nearest Neighbor Problem}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{76:1--76:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Artur Czumaj and Anuj Dawar and Emanuela Merelli},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12483},
  URN =		{urn:nbn:de:0030-drops-124834},
  doi =		{10.4230/LIPIcs.ICALP.2020.76},
  annote =	{Keywords: (approximate) nearest neighbor problem, spherical codes, polytopes, lattices, locality-sensitive hashing (LSH)}
}

Keywords: (approximate) nearest neighbor problem, spherical codes, polytopes, lattices, locality-sensitive hashing (LSH)
Collection: 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)
Issue Date: 2020
Date of publication: 29.06.2020


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI