skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Approximate Nearest Neighbors in Limited Space
We consider the (1+ϵ)-approximate nearest neighbor search problem: given a set X of n points in a d-dimensional space, build a data structure that, given any query point y, finds a point x∈X whose distance to y is at most (1+ϵ)minx∈X ‖x−y‖ for an accuracy parameter ϵ∈(0,1). Our main result is a data structure that occupies only O(ϵ^−2 n log(n)log(1/ϵ)) bits of space, assuming all point coordinates are integers in the range {−n^O(1)…n^O(1)}, i.e., the coordinates have O(logn) bits of precision. This improves over the best previously known space bound of O(ϵ^−2 n log(n)^2), obtained via the randomized dimensionality reduction method of Johnson and Lindenstrauss (1984). We also consider the more general problem of estimating all distances from a collection of query points to all data points X, and provide almost tight upper and lower bounds for the space complexity of this problem.  more » « less
Award ID(s):
1740751 1447476
PAR ID:
10065203
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Conference on Learning Theory
Page Range / eLocation ID:
2012-2036
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Given a set $$P$$ of $$n$$ points in the plane, we consider the problem of computing the number of points of $$P$$ in a query unit disk (i.e., all query disks have the same radius). We show that the main techniques for simplex range searching in the plane can be adapted to this problem. For example, by adapting Matoušek's results, we can build a data structure of $O(n)$ space in $$O(n^{1+\delta})$$ time (for any $$\delta>0$$) so that each query can be answered in $$O(\sqrt{n})$$ time; alternatively, we can build a data structure of $$O(n^2/\log^2 n)$$ space with $$O(n^{1+\delta})$$ preprocessing time (for any $$\delta>0$$) and $$O(\log n)$$ query time. Our techniques lead to improvements for several other classical problems in computational geometry. 1. Given a set of $$n$$ unit disks and a set of $$n$$ points in the plane, the batched unit-disk range counting problem is to compute for each disk the number of points in it. Previous work [Katz and Sharir, 1997] solved the problem in $$O(n^{4/3}\log n)$$ time. We give a new algorithm of $$O(n^{4/3})$$ time, which is optimal as it matches an $$\Omega(n^{4/3})$$-time lower bound. For small $$\chi$$, where $$\chi$$ is the number of pairs of unit disks that intersect, we further improve the algorithm to $$O(n^{2/3}\chi^{1/3}+n^{1+\delta})$$ time, for any $$\delta>0$$. 2. The above result immediately leads to an $$O(n^{4/3})$$ time optimal algorithm for counting the intersecting pairs of circles for a set of $$n$$ unit circles in the plane. The previous best algorithms solve the problem in $$O(n^{4/3}\log n)$$ deterministic time [Katz and Sharir, 1997] or in $$O(n^{4/3}\log^{2/3} n)$$ expected time by a randomized algorithm [Agarwal, Pellegrini, and Sharir, 1993]. 3. Given a set $$P$$ of $$n$$ points in the plane and an integer $$k$$, the distance selection problem is to find the $$k$$-th smallest distance among all pairwise distances of $$P$$. The problem can be solved in $$O(n^{4/3}\log^2 n)$$ deterministic time [Katz and Sharir, 1997] or in $$O(n\log n+n^{2/3}k^{1/3}\log^{5/3}n)$$ expected time by a randomized algorithm [Chan, 2001]. Our new randomized algorithm runs in $$O(n\log n +n^{2/3}k^{1/3}\log n)$$ expected time. 4. Given a set $$P$$ of $$n$$ points in the plane, the discrete $$2$$-center problem is to compute two smallest congruent disks whose centers are in $$P$$ and whose union covers $$P$$. An $$O(n^{4/3}\log^5 n)$$-time algorithm was known [Agarwal, Sharir, and Welzl, 1998]. Our techniques yield a deterministic algorithm of $$O(n^{4/3}\log^{10/3} n\cdot (\log\log n)^{O(1)})$$ time and a randomized algorithm of $$O(n^{4/3}\log^3 n\cdot (\log\log n)^{1/3})$$ expected time. 
    more » « less
  2. We consider the problem of preprocessing a weighted directed planar graph in order to quickly answer exact distance queries. The main tension in this problem is between space S and query time Q , and since the mid-1990s all results had polynomial time-space tradeoffs, e.g., Q = ~ Θ( n/√ S ) or Q = ~Θ( n 5/2 /S 3/2 ). In this article we show that there is no polynomial tradeoff between time and space and that it is possible to simultaneously achieve almost optimal space n 1+ o (1) and almost optimal query time n o (1) . More precisely, we achieve the following space-time tradeoffs: n 1+ o (1) space and log 2+ o (1) n query time, n log 2+ o (1) n space and n o (1) query time, n 4/3+ o (1) space and log 1+ o (1) n query time. We reduce a distance query to a variety of point location problems in additively weighted Voronoi diagrams and develop new algorithms for the point location problem itself using several partially persistent dynamic tree data structures. 
    more » « less
  3. We present new algorithms for computing many faces in arrangements of lines and segments. Given a set $$S$$ of $$n$$ lines (resp., segments) and a set $$P$$ of $$m$$ points in the plane, the problem is to compute the faces of the arrangements of $$S$$ that contain at least one point of $$P$$. For the line case, we give a deterministic algorithm of $$O(m^{2/3}n^{2/3}\log^{2/3} (n/\sqrt{m})+(m+n)\log n)$$ time. This improves the previously best deterministic algorithm [Agarwal, 1990] by a factor of $$\log^{2.22}n$$ and improves the previously best randomized algorithm [Agarwal, Matoušek, and Schwarzkopf, 1998] by a factor of $$\log^{1/3}n$$ in certain cases (e.g., when $$m=\Theta(n)$$). For the segment case, we present a deterministic algorithm of $$O(n^{2/3}m^{2/3}\log n+\tau(n\alpha^2(n)+n\log m+m)\log n)$$ time, where $$\tau=\min\{\log m,\log (n/\sqrt{m})\}$$ and $$\alpha(n)$$ is the inverse Ackermann function. This improves the previously best deterministic algorithm [Agarwal, 1990] by a factor of $$\log^{2.11}n$$ and improves the previously best randomized algorithm [Agarwal, Matoušek, and Schwarzkopf, 1998] by a factor of $$\log n$$ in certain cases (e.g., when $$m=\Theta(n)$$). We also give a randomized algorithm of $$O(m^{2/3}K^{1/3}\log n+\tau(n\alpha(n)+n\log m+m)\log n\log K)$$ expected time, where $$K$$ is the number of intersections of all segments of $$S$$. In addition, we consider the query version of the problem, that is, preprocess $$S$$ to compute the face of the arrangement of $$S$$ that contains any given query point. We present new results that improve the previous work for both the line and the segment cases. In particulary, for the line case, we build a data structure of $$O(n\log n)$$ space in $$O(n\log n)$$ randomized time, so that the face containing the query point can be obtained in $$O(\sqrt{n\log n})$$ time with high probability (more specifically, the query returns a binary search tree representing the face so that standard binary-search-based queries on the face can be handled in $$O(\log n)$$ time each and the face itself can be output explicitly in time linear in its size). 
    more » « less
  4. null (Ed.)
    We initiate the study of quantum algorithms for escaping from saddle points with provable guarantee. Given a function f : R n → R , our quantum algorithm outputs an ϵ -approximate second-order stationary point using O ~ ( log 2 ⁡ ( n ) / ϵ 1.75 ) queries to the quantum evaluation oracle (i.e., the zeroth-order oracle). Compared to the classical state-of-the-art algorithm by Jin et al. with O ~ ( log 6 ⁡ ( n ) / ϵ 1.75 ) queries to the gradient oracle (i.e., the first-order oracle), our quantum algorithm is polynomially better in terms of log ⁡ n and matches its complexity in terms of 1 / ϵ . Technically, our main contribution is the idea of replacing the classical perturbations in gradient descent methods by simulating quantum wave equations, which constitutes the improvement in the quantum query complexity with log ⁡ n factors for escaping from saddle points. We also show how to use a quantum gradient computation algorithm due to Jordan to replace the classical gradient queries by quantum evaluation queries with the same complexity. Finally, we also perform numerical experiments that support our theoretical findings. 
    more » « less
  5. Kumar, Amit; Ron-Zewi, Noga (Ed.)
    The goal of trace reconstruction is to reconstruct an unknown n-bit string x given only independent random traces of x, where a random trace of x is obtained by passing x through a deletion channel. A Statistical Query (SQ) algorithm for trace reconstruction is an algorithm which can only access statistical information about the distribution of random traces of x rather than individual traces themselves. Such an algorithm is said to be 𝓁-local if each of its statistical queries corresponds to an 𝓁-junta function over some block of 𝓁 consecutive bits in the trace. Since several - but not all - known algorithms for trace reconstruction fall under the local statistical query paradigm, it is interesting to understand the abilities and limitations of local SQ algorithms for trace reconstruction. In this paper we establish nearly-matching upper and lower bounds on local Statistical Query algorithms for both worst-case and average-case trace reconstruction. For the worst-case problem, we show that there is an Õ(n^{1/5})-local SQ algorithm that makes all its queries with tolerance τ ≥ 2^{-Õ(n^{1/5})}, and also that any Õ(n^{1/5})-local SQ algorithm must make some query with tolerance τ ≤ 2^{-Ω̃(n^{1/5})}. For the average-case problem, we show that there is an O(log n)-local SQ algorithm that makes all its queries with tolerance τ ≥ 1/poly(n), and also that any O(log n)-local SQ algorithm must make some query with tolerance τ ≤ 1/poly(n). 
    more » « less