An \ell _p oblivious subspace embedding is a distribution over r \times n matrices \Pi such that for any fixed n \times d matrix A , \[ \Pr _{\Pi }[\textrm {for all }x, \ \Vert Ax\Vert _p \le \Vert \Pi Ax\Vert _p \le \kappa \Vert Ax\Vert _p] \ge 9/10,\] where r is the dimension of the embedding, \kappa is the distortion of the embedding, and for an n -dimensional vector y , \Vert y\Vert _p = (\sum _{i=1}^n |y_i|^p)^{1/p} is the \ell _p -norm. Another important property is the sparsity of \Pi , that is, the maximum number of non-zero entries per column, as this determines the running time of computing \Pi A . While for p = 2 there are nearly optimal tradeoffs in terms of the dimension, distortion, and sparsity, for the important case of 1 \le p \lt 2 , much less was known. In this article, we obtain nearly optimal tradeoffs for \ell _1 oblivious subspace embeddings, as well as new tradeoffs for 1 \lt p \lt 2 . Our main results are as follows: (1) We show for every 1 \le p \lt 2 , any oblivious subspace embedding with dimension r has distortion \[more »
Fast Regression for Structured Inputs
We study the $\ell_p$ regression problem, which requires finding $\mathbf{x}\in\mathbb R^{d}$ that minimizes $\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p$ for a matrix $\mathbf{A}\in\mathbb R^{n \times d}$ and response vector $\mathbf{b}\in\mathbb R^{n}$. There has been recent interest in developing subsampling methods for this problem that can outperform standard techniques when $n$ is very large. However, all known subsampling approaches have run time that depends exponentially on $p$, typically, $d^{\mathcal{O}(p)}$, which can be prohibitively expensive.
We improve on this work by showing that for a large class of common \emph{structured matrices}, such as combinations of low-rank matrices, sparse matrices, and Vandermonde matrices, there are subsampling based methods for $\ell_p$ regression that depend polynomially on $p$. For example, we give an algorithm for $\ell_p$ regression on Vandermonde matrices that runs in time $\mathcal{O}(n\log^3 n+(dp^2)^{0.5+\omega}\cdot\text{polylog}\,n)$, where $\omega$ is the exponent of matrix multiplication. The polynomial dependence on $p$ crucially allows our algorithms to extend naturally to efficient algorithms for $\ell_\infty$ regression, via approximation of $\ell_\infty$ by $\ell_{\mathcal{O}(\log n)}$. Of practical interest, we also develop a new subsampling algorithm for $\ell_p$ regression for arbitrary matrices, which is simpler than previous approaches for $p \ge 4$.
- Award ID(s):
- 2046235
- Publication Date:
- NSF-PAR ID:
- 10326695
- Journal Name:
- International Conference on Learning Representations (ICLR)
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We give two new quantum algorithms for solving semidefinite programs (SDPs) providing quantum speed-ups. We consider SDP instances with m constraint matrices, each of dimension n, rank at most r, and sparsity s. The first algorithm assumes an input model where one is given access to an oracle to the entries of the matrices at unit cost. We show that it has run time O~(s^2 (sqrt{m} epsilon^{-10} + sqrt{n} epsilon^{-12})), with epsilon the error of the solution. This gives an optimal dependence in terms of m, n and quadratic improvement over previous quantum algorithms (when m ~~ n). The second algorithm assumes a fully quantum input model in which the input matrices are given as quantum states. We show that its run time is O~(sqrt{m}+poly(r))*poly(log m,log n,B,epsilon^{-1}), with B an upper bound on the trace-norm of all input matrices. In particular the complexity depends only polylogarithmically in n and polynomially in r. We apply the second SDP solver to learn a good description of a quantum state with respect to a set of measurements: Given m measurements and a supply of copies of an unknown state rho with rank at most r, we show we can find in time sqrt{m}*poly(logmore »
-
Abstract We present efficient algorithms for counting points on a smooth plane quartic curve X modulo a prime p . We address both the case where X is defined over $${\mathbb {F}}_p$$ F p and the case where X is defined over $${\mathbb {Q}}$$ Q and p is a prime of good reduction. We consider two approaches for computing $$\#X({\mathbb {F}}_p)$$ # X ( F p ) , one which runs in $$O(p\log p\log \log p)$$ O ( p log p log log p ) time using $$O(\log p)$$ O ( log p ) space and one which runs in $$O(p^{1/2}\log ^2p)$$ O ( p 1 / 2 log 2 p ) time using $$O(p^{1/2}\log p)$$ O ( p 1 / 2 log p ) space. Both approaches yield algorithms that are faster in practice than existing methods. We also present average polynomial-time algorithms for $$X/{\mathbb {Q}}$$ X / Q that compute $$\#X({\mathbb {F}}_p)$$ # X ( F p ) for good primes $$p\leqslant N$$ p ⩽ N in $$O(N\log ^3 N)$$ O ( N log 3 N ) time using O ( N ) space. These are the first practical implementations of average polynomial-time algorithms for curves that aremore »
-
Abstract We show how to construct a $$(1+\varepsilon )$$ ( 1 + ε ) -spanner over a set $${P}$$ P of n points in $${\mathbb {R}}^d$$ R d that is resilient to a catastrophic failure of nodes. Specifically, for prescribed parameters $${\vartheta },\varepsilon \in (0,1)$$ ϑ , ε ∈ ( 0 , 1 ) , the computed spanner $${G}$$ G has $$\begin{aligned} {{\mathcal {O}}}\bigl (\varepsilon ^{-O(d)} {\vartheta }^{-6} n(\log \log n)^6 \log n \bigr ) \end{aligned}$$ O ( ε - O ( d ) ϑ - 6 n ( log log n ) 6 log n ) edges. Furthermore, for any k , and any deleted set $${{B}}\subseteq {P}$$ B ⊆ P of k points, the residual graph $${G}\setminus {{B}}$$ G \ B is a $$(1+\varepsilon )$$ ( 1 + ε ) -spanner for all the points of $${P}$$ P except for $$(1+{\vartheta })k$$ ( 1 + ϑ ) k of them. No previous constructions, beyond the trivial clique with $${{\mathcal {O}}}(n^2)$$ O ( n 2 ) edges, were known with this resilience property (i.e., only a tiny additional fraction of vertices, $$\vartheta |B|$$ ϑ | B | , lose their distance preserving connectivity). Our construction works by first solvingmore »
-
Abstract We obtain new quantitative estimates on Weyl Law remainders under dynamical assumptions on the geodesic flow. On a smooth compact Riemannian manifold ( M , g ) of dimension n , let $$\Pi _\lambda $$ Π λ denote the kernel of the spectral projector for the Laplacian, $$\mathbb {1}_{[0,\lambda ^2]}(-\Delta _g)$$ 1 [ 0 , λ 2 ] ( - Δ g ) . Assuming only that the set of near periodic geodesics over $${W}\subset M$$ W ⊂ M has small measure, we prove that as $$\lambda \rightarrow \infty $$ λ → ∞ $$\begin{aligned} \int _{{W}} \Pi _\lambda (x,x)dx=(2\pi )^{-n}{{\,\textrm{vol}\,}}_{_{{\mathbb {R}}^n}}\!(B){{\,\textrm{vol}\,}}_g({W})\,\lambda ^n+O\Big (\frac{\lambda ^{n-1}}{\log \lambda }\Big ), \end{aligned}$$ ∫ W Π λ ( x , x ) d x = ( 2 π ) - n vol R n ( B ) vol g ( W ) λ n + O ( λ n - 1 log λ ) , where B is the unit ball. One consequence of this result is that the improved remainder holds on all product manifolds, in particular giving improved estimates for the eigenvalue counting function in the product setup. Our results also include logarithmic gains on asymptotics for the off-diagonal spectral projector $$\Pimore »