Tight Bounds for ℓ 1 Oblivious Subspace Embeddings
An \ell _p oblivious subspace embedding is a distribution over r \times n matrices \Pi such that for any fixed n \times d matrix A , $\Pr _{\Pi }[\textrm {for all }x, \ \Vert Ax\Vert _p \le \Vert \Pi Ax\Vert _p \le \kappa \Vert Ax\Vert _p] \ge 9/10,$ where r is the dimension of the embedding, \kappa is the distortion of the embedding, and for an n -dimensional vector y , \Vert y\Vert _p = (\sum _{i=1}^n |y_i|^p)^{1/p} is the \ell _p -norm. Another important property is the sparsity of \Pi , that is, the maximum number of non-zero entries per column, as this determines the running time of computing \Pi A . While for p = 2 there are nearly optimal tradeoffs in terms of the dimension, distortion, and sparsity, for the important case of 1 \le p \lt 2 , much less was known. In this article, we obtain nearly optimal tradeoffs for \ell _1 oblivious subspace embeddings, as well as new tradeoffs for 1 \lt p \lt 2 . Our main results are as follows: (1) We show for every 1 \le p \lt 2 , any oblivious subspace embedding with dimension r has distortion $\kappa = \Omega \left(\frac{1}{\left(\frac{1}{d}\right)^{1 / p} \log ^{2 / p}r + \left(\frac{r}{n}\right)^{1 / p - 1 / 2}}\right).$ When r = {\operatorname{poly}}(d) \ll n in applications, this gives a \kappa = \Omega (d^{1/p}\log ^{-2/p} d) lower bound, and shows the oblivious subspace embedding of Sohler and Woodruff (STOC, 2011) for p = 1 is optimal up to {\operatorname{poly}}(\log (d)) factors. (2) We give sparse oblivious subspace embeddings for every 1 \le p \lt 2 . Importantly, for p = 1 , we achieve r = O(d \log d) , \kappa = O(d \log d) and s = O(\log d) non-zero entries per column. The best previous construction with s \le {\operatorname{poly}}(\log d) is due to Woodruff and Zhang (COLT, 2013), giving \kappa = \Omega (d^2 {\operatorname{poly}}(\log d)) or \kappa = \Omega (d^{3/2} \sqrt {\log n} \cdot {\operatorname{poly}}(\log d)) and r \ge d \cdot {\operatorname{poly}}(\log d) ; in contrast our r = O(d \log d) and \kappa = O(d \log d) are optimal up to {\operatorname{poly}}(\log (d)) factors even for dense matrices. We also give (1) \ell _p oblivious subspace embeddings with an expected 1+\varepsilon number of non-zero entries per column for arbitrarily small \varepsilon \gt 0 , and (2) the first oblivious subspace embeddings for 1 \le p \lt 2 with O(1) -distortion and dimension independent of n . Oblivious subspace embeddings are crucial for distributed and streaming environments, as well as entrywise \ell _p low-rank approximation. Our results give improved algorithms for these applications.  more » « less
Award ID(s):
NSF-PAR ID:
10423335
Author(s) / Creator(s):
;
Date Published:
Journal Name:
ACM Transactions on Algorithms
Volume:
18
Issue:
1
ISSN:
1549-6325
Page Range / eLocation ID:
1 to 32
Format(s):
Medium: X
National Science Foundation
##### More Like this
1. (Ed.)
Can linear systems be solved faster than matrix multiplication? While there has been remarkable progress for the special cases of graph structured linear systems, in the general setting, the bit complexity of solving an $n \times n$ linear system $Ax=b$ is $\tilde{O}(n^\omega)$, where $\omega < 2.372864$ is the matrix multiplication exponent. Improving on this has been an open problem even for sparse linear systems with poly$(n)$ condition number. In this paper, we present an algorithm that solves linear systems in sparse matrices asymptotically faster than matrix multiplication for any $\omega > 2$. This speedup holds for any input matrix $A$ with $o(n^{\omega -1}/\log(\kappa(A)))$ non-zeros, where $\kappa(A)$ is the condition number of $A$. For poly$(n)$-conditioned matrices with $\tilde{O}(n)$ nonzeros, and the current value of $\omega$, the bit complexity of our algorithm to solve to within any $1/\text{poly}(n)$ error is $O(n^{2.331645})$. Our algorithm can be viewed as an efficient, randomized implementation of the block Krylov method via recursive low displacement rank factorizations. It is inspired by the algorithm of [Eberly et al. ISSAC 06 07] for inverting matrices over finite fields. In our analysis of numerical stability, we develop matrix anti-concentration techniques to bound the smallest eigenvalue and the smallest gap in eigenvalues of semi-random matrices.
more » « less
2. We study the $\ell_p$ regression problem, which requires finding $\mathbf{x}\in\mathbb R^{d}$ that minimizes $\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p$ for a matrix $\mathbf{A}\in\mathbb R^{n \times d}$ and response vector $\mathbf{b}\in\mathbb R^{n}$. There has been recent interest in developing subsampling methods for this problem that can outperform standard techniques when $n$ is very large. However, all known subsampling approaches have run time that depends exponentially on $p$, typically, $d^{\mathcal{O}(p)}$, which can be prohibitively expensive. We improve on this work by showing that for a large class of common \emph{structured matrices}, such as combinations of low-rank matrices, sparse matrices, and Vandermonde matrices, there are subsampling based methods for $\ell_p$ regression that depend polynomially on $p$. For example, we give an algorithm for $\ell_p$ regression on Vandermonde matrices that runs in time $\mathcal{O}(n\log^3 n+(dp^2)^{0.5+\omega}\cdot\text{polylog}\,n)$, where $\omega$ is the exponent of matrix multiplication. The polynomial dependence on $p$ crucially allows our algorithms to extend naturally to efficient algorithms for $\ell_\infty$ regression, via approximation of $\ell_\infty$ by $\ell_{\mathcal{O}(\log n)}$. Of practical interest, we also develop a new subsampling algorithm for $\ell_p$ regression for arbitrary matrices, which is simpler than previous approaches for $p \ge 4$.
more » « less
3. Abstract Assume $\mathsf {ZF} + \mathsf {AD}$ and all sets of reals are Suslin. Let $\Gamma$ be a pointclass closed under $\wedge$ , $\vee$ , $\forall ^{\mathbb {R}}$ , continuous substitution, and has the scale property. Let $\kappa = \delta (\Gamma )$ be the supremum of the length of prewellorderings on $\mathbb {R}$ which belong to $\Delta = \Gamma \cap \check \Gamma$ . Let $\mathsf {club}$ denote the collection of club subsets of $\kappa$ . Then the countable length everywhere club uniformization holds for $\kappa$ : For every relation $R \subseteq {}^{<{\omega _1}}\kappa \times \mathsf {club}$ with the property that for all $\ell \in {}^{<{\omega _1}}\kappa$ and clubs $C \subseteq D \subseteq \kappa$ , $R(\ell ,D)$ implies $R(\ell ,C)$ , there is a uniformization function $\Lambda : \mathrm {dom}(R) \rightarrow \mathsf {club}$ with the property that for all $\ell \in \mathrm {dom}(R)$ , $R(\ell ,\Lambda (\ell ))$ . In particular, under these assumptions, for all $n \in \omega$ , $\boldsymbol {\delta }^1_{2n + 1}$ satisfies the countable length everywhere club uniformization.
more » « less
4. Abstract

The Dushnik–Miller dimension of a posetPis the leastdfor whichPcan be embedded into a product ofdchains. Lewis and Souza isibility order on the interval of integers$$[N/\kappa , N]$$$\left[N/\kappa ,N\right]$is bounded above by$$\kappa (\log \kappa )^{1+o(1)}$$$\kappa {\left(log\kappa \right)}^{1+o\left(1\right)}$and below by$$\Omega ((\log \kappa /\log \log \kappa )^2)$$$\Omega \left({\left(log\kappa /loglog\kappa \right)}^{2}\right)$. We improve the upper bound to$$O((\log \kappa )^3/(\log \log \kappa )^2).$$$O\left({\left(log\kappa \right)}^{3}/{\left(loglog\kappa \right)}^{2}\right).$We deduce this bound from a more general result on posets of multisets ordered by inclusion. We also consider other divisibility orders and give a bound for polynomials ordered by divisibility.

more » « less
5. Abstract We obtain new quantitative estimates on Weyl Law remainders under dynamical assumptions on the geodesic flow. On a smooth compact Riemannian manifold ( M ,  g ) of dimension n , let $$\Pi _\lambda$$ Π λ denote the kernel of the spectral projector for the Laplacian, $$\mathbb {1}_{[0,\lambda ^2]}(-\Delta _g)$$ 1 [ 0 , λ 2 ] ( - Δ g ) . Assuming only that the set of near periodic geodesics over $${W}\subset M$$ W ⊂ M has small measure, we prove that as $$\lambda \rightarrow \infty$$ λ → ∞ \begin{aligned} \int _{{W}} \Pi _\lambda (x,x)dx=(2\pi )^{-n}{{\,\textrm{vol}\,}}_{_{{\mathbb {R}}^n}}\!(B){{\,\textrm{vol}\,}}_g({W})\,\lambda ^n+O\Big (\frac{\lambda ^{n-1}}{\log \lambda }\Big ), \end{aligned} ∫ W Π λ ( x , x ) d x = ( 2 π ) - n vol R n ( B ) vol g ( W ) λ n + O ( λ n - 1 log λ ) , where B is the unit ball. One consequence of this result is that the improved remainder holds on all product manifolds, in particular giving improved estimates for the eigenvalue counting function in the product setup. Our results also include logarithmic gains on asymptotics for the off-diagonal spectral projector $$\Pi _\lambda (x,y)$$ Π λ ( x , y ) under the assumption that the set of geodesics that pass near both x and y has small measure, and quantitative improvements for Kuznecov sums under non-looping type assumptions. The key technique used in our study of the spectral projector is that of geodesic beams.
more » « less