License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.SOCG.2015.436
URN: urn:nbn:de:0030-drops-51181
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2015/5118/
Go to the corresponding LIPIcs Volume Portal


Anagnostopoulos, Evangelos ; Emiris, Ioannis Z. ; Psarros, Ioannis

Low-Quality Dimension Reduction and High-Dimensional Approximate Nearest Neighbor

pdf-format:
35.pdf (0.5 MB)


Abstract

The approximate nearest neighbor problem (epsilon-ANN) in Euclidean settings is a fundamental question, which has been addressed by two main approaches: Data-dependent space partitioning techniques perform well when the dimension is relatively low, but are affected by the curse of dimensionality. On the other hand, locality sensitive hashing has polynomial dependence in the dimension, sublinear query time with an exponent inversely proportional to (1+epsilon)^2, and subquadratic space requirement.

We generalize the Johnson-Lindenstrauss Lemma to define "low-quality" mappings to a Euclidean space of significantly lower dimension, such that they satisfy a requirement weaker than approximately preserving all distances or even preserving the nearest neighbor. This mapping guarantees, with high probability, that an approximate nearest neighbor lies among the k approximate nearest neighbors in the projected space. These can be efficiently retrieved while using only linear storage by a data structure, such as BBD-trees. Our overall algorithm, given n points in dimension d, achieves space usage in O(dn), preprocessing time in O(dn log n), and query time in O(d n^{rho} log n), where rho is proportional to 1 - 1/loglog n, for fixed epsilon in (0, 1). The dimension reduction is larger if one assumes that point sets possess some structure, namely bounded expansion rate. We implement our method and present experimental results in up to 500 dimensions and 10^6 points, which show that the practical performance is better than predicted by the theoretical analysis. In addition, we compare our approach with E2LSH.

BibTeX - Entry

@InProceedings{anagnostopoulos_et_al:LIPIcs:2015:5118,
  author =	{Evangelos Anagnostopoulos and Ioannis Z. Emiris and Ioannis Psarros},
  title =	{{Low-Quality Dimension Reduction and High-Dimensional Approximate Nearest Neighbor}},
  booktitle =	{31st International Symposium on Computational Geometry (SoCG 2015)},
  pages =	{436--450},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-83-5},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{34},
  editor =	{Lars Arge and J{\'a}nos Pach},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2015/5118},
  URN =		{urn:nbn:de:0030-drops-51181},
  doi =		{10.4230/LIPIcs.SOCG.2015.436},
  annote =	{Keywords: Approximate nearest neighbor, Randomized embeddings, Curse of dimensionality, Johnson-Lindenstrauss Lemma, Bounded expansion rate, Experimental study}
}

Keywords: Approximate nearest neighbor, Randomized embeddings, Curse of dimensionality, Johnson-Lindenstrauss Lemma, Bounded expansion rate, Experimental study
Collection: 31st International Symposium on Computational Geometry (SoCG 2015)
Issue Date: 2015
Date of publication: 12.06.2015


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI