Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Tauman Kalai, Yael (Ed.)We introduce and study the communication complexity of computing the inner product of two vectors, where the input is restricted w.r.t. a norm N on the space ℝⁿ. Here, Alice and Bob hold two vectors v,u such that ‖v‖_N ≤ 1 and ‖u‖_{N^*} ≤ 1, where N^* is the dual norm. The goal is to compute their inner product ⟨v,u⟩ up to an ε additive term. The problem is denoted by IP_N, and generalizes important previously studied problems, such as: (1) Computing the expectation 𝔼_{x∼𝒟}[f(x)] when Alice holds 𝒟 and Bob holds f is equivalent to IP_{𝓁₁}. (2) Computing v^TAv where Alice has a symmetric matrix with bounded operator norm (denoted S_∞) and Bob has a vector v where ‖v‖₂ = 1. This problem is complete for quantum communication complexity and is equivalent to IP_{S_∞}. We systematically study IP_N, showing the following results, near tight in most cases: 1) For any symmetric norm N, given ‖v‖_N ≤ 1 and ‖u‖_{N^*} ≤ 1 there is a randomized protocol using 𝒪̃(ε^{-6} log n) bits of communication that returns a value in ⟨u,v⟩±ε with probability 2/3 - we will denote this by ℛ_{ε,1/3}(IP_N) ≤ 𝒪̃(ε^{-6} log n). In a special case where N = 𝓁_p and N^* = 𝓁_q for p^{-1} + q^{-1} = 1, we obtain an improved bound ℛ_{ε,1/3}(IP_{𝓁_p}) ≤ 𝒪(ε^{-2} log n), nearly matching the lower bound ℛ_{ε, 1/3}(IP_{𝓁_p}) ≥ Ω(min(n, ε^{-2})). 2) One way communication complexity ℛ^{→}_{ε,δ}(IP_{𝓁_p}) ≤ 𝒪(ε^{-max(2,p)}⋅ log n/ε), and a nearly matching lower bound ℛ^{→}_{ε, 1/3}(IP_{𝓁_p}) ≥ Ω(ε^{-max(2,p)}) for ε^{-max(2,p)} ≪ n. 3) One way communication complexity ℛ^{→}_{ε,δ}(N) for a symmetric norm N is governed by the distortion of the embedding 𝓁_∞^k into N. Specifically, while a small distortion embedding easily implies a lower bound Ω(k), we show that, conversely, non-existence of such an embedding implies protocol with communication k^𝒪(log log k) log² n. 4) For arbitrary origin symmetric convex polytope P, we show ℛ_{ε,1/3}(IP_{N}) ≤ 𝒪(ε^{-2} log xc(P)), where N is the unique norm for which P is a unit ball, and xc(P) is the extension complexity of P (i.e. the smallest number of inequalities describing some polytope P' s.t. P is projection of P').more » « less
-
The indexing algorithms for the high-dimensional nearest neighbor search (NNS) with the best worst-case guarantees are based on the randomized Locality Sensitive Hashing (LSH), and its derivatives. In practice, many heuristic approaches exist to "learn" the best indexing method in order to speed-up NNS, crucially adapting to the structure of the given dataset. Oftentimes, these heuristics outperform the LSH-based algorithms on real datasets, but, almost always, come at the cost of losing the guarantees of either correctness or robust performance on adversarial queries, or apply to datasets with an assumed extra structure/model. In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms, while optimizing the hashing to the structure of the dataset (think instance-optimal algorithms) for performance on the minimum-performing query. We evaluate the algorithm's ability to optimize for a given dataset both theoretically and practically. On the theoretical side, we exhibit a natural setting (dataset model) where our algorithm is much better than the standard theoretical one. On the practical side, we run experiments that show that our algorithm has a 1.8x and 2.1x better recall on the worst-performing queries to the MNIST and ImageNet datasets. https://arxiv.org/abs/2108.05433more » « less