Abstract
Locality Sensitive Hashing (LSH) is an effective method to index a set of points such that we can efficiently find the nearest neighbors of a query point. We extend this method to our novel Setquery LSH (SLSH), such that it can find the nearest neighbors of a set of points, given as a query.
Let s(x,y) be the similarity between two points x and y. We define a similarity between a set Q and a point x by aggregating the similarities s(p,x) for all p∈ Q. For example, we can take s(p,x) to be the angular similarity between p and x (i.e., 1(∠(x,p)/π)), and aggregate by arithmetic or geometric averaging, or taking the lowest similarity.
We develop locality sensitive hash families and data structures for a large set of such arithmetic and geometric averaging similarities, and analyze their collision probabilities. We also establish an analogous framework and hash families for distance functions. Specifically, we give a structure for the euclidean distance aggregated by either averaging or taking the maximum.
We leverage SLSH to solve a geometric extension of the approximate near neighbors problem. In this version, we consider a metric for which the unit ball is an ellipsoid and its orientation is specified with the query.
An important application that motivates our work is group recommendation systems. Such a system embeds movies and users in the same feature space, and the task of recommending a movie for a group to watch together, translates to a setquery Q using an appropriate similarity.
BibTeX  Entry
@InProceedings{kaplan_et_al:LIPIcs:2020:12275,
author = {Haim Kaplan and Jay Tenenbaum},
title = {{Locality Sensitive Hashing for SetQueries, Motivated by Group Recommendations}},
booktitle = {17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)},
pages = {28:128:18},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959771504},
ISSN = {18688969},
year = {2020},
volume = {162},
editor = {Susanne Albers},
publisher = {Schloss DagstuhlLeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12275},
URN = {urn:nbn:de:0030drops122756},
doi = {10.4230/LIPIcs.SWAT.2020.28},
annote = {Keywords: Locality sensitive hashing, nearest neighbors, similarity search, group recommendations, distance functions, similarity functions, ellipsoid}
}
Keywords: 

Locality sensitive hashing, nearest neighbors, similarity search, group recommendations, distance functions, similarity functions, ellipsoid 
Collection: 

17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020) 
Issue Date: 

2020 
Date of publication: 

12.06.2020 