Flores-Velazco, Alejandro ; Mount, David M.

Boundary-Sensitive Approach for Approximate Nearest-Neighbor Classification

Nearest-neighbor classification is a fundamental technique in machine learning. Given a training set P of n labeled points in ℝ^d and an approximation parameter 0 < ε ≤ 1/2, any unlabeled query point should be classified with the class of any of its ε-approximate nearest neighbors in P. Answering these queries efficiently has been the focus of extensive research, which has proposed techniques mainly tailored to the more general problem of ε-approximate nearest-neighbor search. While the latter can only hope to provide query time and space complexities dependent on n, the problem of nearest-neighbor classification admits other parameters more suitable to its analysis. One such parameter is the number k_ε of ε-border points, which describes the complexity of the boundaries between sets of points of different classes.
This paper presents a new data structure called the Chromatic AVD. It is the first approach for ε-approximate nearest-neighbor classification whose space and query time complexities depend only on ε, k_ε, and d, while being independent of both n and Δ, the spread of P.
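
To make the query semantics concrete, the following is a minimal brute-force sketch of ε-approximate nearest-neighbor classification. It is not the Chromatic AVD and makes no attempt at the complexities stated above: it scans all n points. It assumes the standard definition of an ε-approximate nearest neighbor of a query q, namely any point of P whose distance to q is at most (1 + ε) times the distance from q to its exact nearest neighbor, and it assumes Euclidean distance. The function names are hypothetical and chosen only for illustration.

```python
import math


def valid_classes(points, labels, q, eps):
    """Return the set of classes that are acceptable answers for query q.

    A class is acceptable if it labels at least one eps-approximate
    nearest neighbor of q (assumed definition: distance at most
    (1 + eps) times the exact nearest-neighbor distance).
    """
    dists = [math.dist(p, q) for p in points]  # Euclidean distances (assumption)
    threshold = (1 + eps) * min(dists)
    return {labels[i] for i, d in enumerate(dists) if d <= threshold}


def classify(points, labels, q, eps):
    """Return one acceptable class for q.

    The exact nearest neighbor is always an eps-approximate nearest
    neighbor, so its class is always a valid answer.
    """
    dists = [math.dist(p, q) for p in points]
    nearest = min(range(len(points)), key=dists.__getitem__)
    return labels[nearest]


if __name__ == "__main__":
    P = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
    classes = ["red", "blue", "blue"]
    q = (0.45, 0.0)
    # Near the boundary between classes, a larger eps admits more answers.
    print(valid_classes(P, classes, q, 0.5))
    print(valid_classes(P, classes, q, 0.1))
    print(classify(P, classes, q, 0.5))
```

In this toy input the query lies close to the boundary between the two classes, so with ε = 1/2 either class is an acceptable answer, while with ε = 0.1 only the class of the exact nearest neighbor is. Far from any class boundary every approximate neighbor carries the same label, which is the intuition behind analyzing the problem in terms of boundary complexity rather than n.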

