We present an algorithm that, with high probability, generates a random spanning tree of an edge-weighted undirected graph in $$\tilde{O}(n^{5/3}m^{1/3})$$ time (the $$\tilde{O}(\cdot)$$ notation hides $$\mathrm{poly}(\log n)$$ factors). The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm, due to Colbourn et al., which runs in matrix multiplication time, $$O(n^\omega)$$. For the special case of unweighted graphs, this improves upon the best previously known running time of $$\tilde{O}(\min\{n^{\omega}, m\sqrt{n}, m^{4/3}\})$$ for $$m \gg n^{7/4}$$ (Colbourn et al. '96, Kelner-Madry '09, Madry et al. '15). The effective resistance metric is essential to our algorithm, as in the work of Madry et al., but we eschew the determinant-based and random-walk-based techniques used by previous algorithms. Instead, our algorithm is based on Gaussian elimination and the fact that effective resistance is preserved in the graph resulting from eliminating a subset of vertices (called a Schur complement). As part of our algorithm, we show how to compute $$\epsilon$$-approximate effective resistances for a set $$S$$ of vertex pairs via approximate Schur complements in $$\tilde{O}(m + (n + |S|)\epsilon^{-2})$$ time, without using the Johnson-Lindenstrauss lemma, which requires $$\tilde{O}(\min\{(m + |S|)\epsilon^{-2},\ m + n\epsilon^{-4} + |S|\epsilon^{-2}\})$$ time. We combine this approximation procedure with an error-correction procedure for handling edges where our estimate is not sufficiently accurate.
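The Schur complement fact at the heart of this approach is easy to check numerically. Below is a minimal, illustrative sketch (not the paper's algorithm): it computes effective resistances from the Laplacian pseudoinverse and verifies that they are unchanged after eliminating a vertex subset via the Schur complement. The small weighted graph and the eliminated set are hypothetical examples.

```python
# Effective resistances from the Laplacian pseudoinverse, and a check that a
# Schur complement (eliminating a vertex subset) preserves them.  Illustrative
# only; the graph and eliminated set below are hypothetical.
import numpy as np

def laplacian(n, weighted_edges):
    """Graph Laplacian from (u, v, weight) triples of an undirected graph."""
    L = np.zeros((n, n))
    for u, v, w in weighted_edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

def effective_resistance(L, u, v):
    """R_eff(u, v) = (e_u - e_v)^T L^+ (e_u - e_v)."""
    x = np.zeros(L.shape[0]); x[u], x[v] = 1.0, -1.0
    return x @ np.linalg.pinv(L) @ x

edges = [(0, 1, 2.0), (1, 2, 1.0), (2, 3, 3.0), (3, 0, 1.0), (0, 2, 0.5)]
L = laplacian(4, edges)

F, C = [0, 1, 2], [3]                      # keep F, eliminate C
S = L[np.ix_(F, F)] - L[np.ix_(F, C)] @ np.linalg.inv(L[np.ix_(C, C)]) @ L[np.ix_(C, F)]

print(effective_resistance(L, 0, 2))       # resistance in the full graph
print(effective_resistance(S, 0, 2))       # identical value on the Schur complement
```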
Balancing Gaussian vectors in high dimension
Motivated by problems in controlled experiments, we study the discrepancy of random matrices with continuous entries where the number of columns $$n$$ is much larger than the number of rows $$m$$. Our first result shows that if $$\omega(1) = m = o(n)$$, a matrix with i.i.d. standard Gaussian entries has discrepancy $$\Theta(\sqrt{n} \, 2^{-n/m})$$ with high probability. This provides sharp guarantees for Gaussian discrepancy in a regime that had not been considered before in the existing literature. Our results also apply to a more general family of random matrices with continuous i.i.d. entries, assuming that $$m = O(n/\log{n})$$. The proof is non-constructive and is an application of the second moment method. Our second result is algorithmic and applies to random matrices whose entries are i.i.d. and have a Lipschitz density. We present a randomized polynomial-time algorithm that achieves discrepancy $$e^{-\Omega(\log^2(n)/m)}$$ with high probability, provided that $$m = O(\sqrt{\log{n}})$$. In the one-dimensional case, this matches the best known algorithmic guarantees due to Karmarkar–Karp. For higher dimensions $$2 \leq m = O(\sqrt{\log{n}})$$, this establishes the first efficient algorithm achieving discrepancy smaller than $$O( \sqrt{m} )$$.
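For concreteness, the discrepancy quantity studied here is $$\min_{x \in \{\pm 1\}^n} \Vert Ax \Vert_\infty$$ for an $$m \times n$$ matrix $$A$$. The sketch below (illustrative only, with hypothetical dimensions) computes it by brute force for a tiny Gaussian instance; brute force is exponential in $$n$$ and is unrelated to the paper's algorithmic result.

```python
# Brute-force discrepancy of a small Gaussian matrix: disc(A) = min over x in
# {-1,+1}^n of ||Ax||_inf.  Exponential in n; for illustration only.
import itertools
import numpy as np

def discrepancy(A):
    _, n = A.shape
    best = np.inf
    for signs in itertools.product([-1.0, 1.0], repeat=n):
        best = min(best, np.max(np.abs(A @ np.array(signs))))
    return best

rng = np.random.default_rng(0)
m, n = 2, 12                                   # hypothetical small instance
A = rng.standard_normal((m, n))
print(discrepancy(A))                          # roughly on the order of sqrt(n) * 2^(-n/m)
```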
- Award ID(s):
- 1712596
- PAR ID:
- 10219146
- Editor(s):
- Abernethy, Jacob; Agarwal, Shivani
- Date Published:
- Journal Name:
- Proceedings of Machine Learning Research
- Volume:
- 125
- ISSN:
- 2640-3498
- Page Range / eLocation ID:
- 3455-3486
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
We give two new quantum algorithms for solving semidefinite programs (SDPs) providing quantum speed-ups. We consider SDP instances with $$m$$ constraint matrices, each of dimension $$n$$, rank at most $$r$$, and sparsity $$s$$. The first algorithm assumes an input model where one is given access to an oracle to the entries of the matrices at unit cost. We show that it has run time $$\tilde{O}(s^2(\sqrt{m}\,\epsilon^{-10} + \sqrt{n}\,\epsilon^{-12}))$$, with $$\epsilon$$ the error of the solution. This gives an optimal dependence in terms of $$m$$ and $$n$$ and a quadratic improvement over previous quantum algorithms (when $$m \approx n$$). The second algorithm assumes a fully quantum input model in which the input matrices are given as quantum states. We show that its run time is $$\tilde{O}(\sqrt{m} + \mathrm{poly}(r)) \cdot \mathrm{poly}(\log m, \log n, B, \epsilon^{-1})$$, with $$B$$ an upper bound on the trace-norm of all input matrices. In particular, the complexity depends only polylogarithmically on $$n$$ and polynomially on $$r$$. We apply the second SDP solver to learn a good description of a quantum state with respect to a set of measurements: given $$m$$ measurements and a supply of copies of an unknown state $$\rho$$ with rank at most $$r$$, we show we can find in time $$\sqrt{m} \cdot \mathrm{poly}(\log m, \log n, r, \epsilon^{-1})$$ a description of the state as a quantum circuit preparing a density matrix which has the same expectation values as $$\rho$$ on the $$m$$ measurements, up to error $$\epsilon$$. The density matrix obtained is an approximation to the maximum entropy state consistent with the measurement data considered in Jaynes' principle from statistical mechanics. As in previous work, we obtain our algorithm by "quantizing" classical SDP solvers based on the matrix multiplicative weight update method. One of our main technical contributions is a quantum Gibbs state sampler for low-rank Hamiltonians, given quantum states encoding these Hamiltonians, with a polylogarithmic dependence on its dimension, which is based on ideas developed in quantum principal component analysis. We also develop a "fast" quantum OR lemma with a quadratic improvement in gate complexity over the construction of Harrow et al. [Harrow et al., 2017]. We believe both techniques might be of independent interest.
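As context for the "quantizing" remark above, here is a minimal classical sketch of one matrix multiplicative weights (MMW) update: the algorithm maintains a density matrix proportional to the matrix exponential of the accumulated (negated, scaled) loss matrices. The loss matrices and step size below are hypothetical placeholders, and this is not the quantum solver itself.

```python
# Classical matrix multiplicative weights (MMW) update, shown for context only:
# rho is the normalized matrix exponential of the accumulated loss matrices.
# Loss matrices and step size eta below are hypothetical placeholders.
import numpy as np

def mmw_density(losses, eta):
    """Return rho = exp(-eta * sum(losses)) / tr(exp(-eta * sum(losses)))."""
    acc = -eta * sum(losses)
    w, V = np.linalg.eigh(acc)             # matrix exponential via eigendecomposition
    W = (V * np.exp(w)) @ V.T
    return W / np.trace(W)

rng = np.random.default_rng(1)
n = 4
losses = []
for _ in range(5):
    B = rng.standard_normal((n, n))
    losses.append((B + B.T) / 2)           # symmetric "loss" matrices

rho = mmw_density(losses, eta=0.1)
print(np.trace(rho))                       # 1.0: rho is a density matrix
print(np.linalg.eigvalsh(rho))             # nonnegative spectrum
```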
-
An $$\ell_p$$ oblivious subspace embedding is a distribution over $$r \times n$$ matrices $$\Pi$$ such that for any fixed $$n \times d$$ matrix $$A$$, $$\Pr_{\Pi}[\text{for all } x,\ \Vert Ax\Vert_p \le \Vert \Pi Ax\Vert_p \le \kappa \Vert Ax\Vert_p] \ge 9/10$$, where $$r$$ is the dimension of the embedding, $$\kappa$$ is the distortion of the embedding, and for an $$n$$-dimensional vector $$y$$, $$\Vert y\Vert_p = (\sum_{i=1}^n |y_i|^p)^{1/p}$$ is the $$\ell_p$$-norm. Another important property is the sparsity of $$\Pi$$, that is, the maximum number of non-zero entries per column, as this determines the running time of computing $$\Pi A$$. While for $$p = 2$$ there are nearly optimal tradeoffs in terms of the dimension, distortion, and sparsity, for the important case of $$1 \le p < 2$$, much less was known. In this article, we obtain nearly optimal tradeoffs for $$\ell_1$$ oblivious subspace embeddings, as well as new tradeoffs for $$1 < p < 2$$. Our main results are as follows: (1) We show for every $$1 \le p < 2$$, any oblivious subspace embedding with dimension $$r$$ has distortion $$\kappa = \Omega\left(\frac{1}{(1/d)^{1/p} \log^{2/p} r + (r/n)^{1/p - 1/2}}\right)$$. When $$r = \mathrm{poly}(d) \ll n$$ in applications, this gives a $$\kappa = \Omega(d^{1/p}\log^{-2/p} d)$$ lower bound, and shows the oblivious subspace embedding of Sohler and Woodruff (STOC, 2011) for $$p = 1$$ is optimal up to $$\mathrm{poly}(\log d)$$ factors. (2) We give sparse oblivious subspace embeddings for every $$1 \le p < 2$$. Importantly, for $$p = 1$$, we achieve $$r = O(d \log d)$$, $$\kappa = O(d \log d)$$ and $$s = O(\log d)$$ non-zero entries per column. The best previous construction with $$s \le \mathrm{poly}(\log d)$$ is due to Woodruff and Zhang (COLT, 2013), giving $$\kappa = \Omega(d^2\,\mathrm{poly}(\log d))$$ or $$\kappa = \Omega(d^{3/2} \sqrt{\log n} \cdot \mathrm{poly}(\log d))$$ and $$r \ge d \cdot \mathrm{poly}(\log d)$$; in contrast, our $$r = O(d \log d)$$ and $$\kappa = O(d \log d)$$ are optimal up to $$\mathrm{poly}(\log d)$$ factors even for dense matrices. We also give (1) $$\ell_p$$ oblivious subspace embeddings with an expected $$1+\varepsilon$$ number of non-zero entries per column for arbitrarily small $$\varepsilon > 0$$, and (2) the first oblivious subspace embeddings for $$1 \le p < 2$$ with $$O(1)$$-distortion and dimension independent of $$n$$. Oblivious subspace embeddings are crucial for distributed and streaming environments, as well as entrywise $$\ell_p$$ low-rank approximation. Our results give improved algorithms for these applications.
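To make the definition concrete, the sketch below draws a sparse random matrix $$\Pi$$ with a single Cauchy entry per column (a common style of $$\ell_1$$ sketch, not necessarily the construction from this article) and empirically measures the $$\ell_1$$ distortion over random directions in the column space of a hypothetical matrix $$A$$.

```python
# Empirical look at l1 distortion for a sparse sketch with one Cauchy entry per
# column (illustrative; not necessarily the construction from this article).
import numpy as np

rng = np.random.default_rng(2)
n, d, r = 2000, 5, 60                      # hypothetical dimensions
A = rng.standard_normal((n, d))

Pi = np.zeros((r, n))                      # sparse sketch: one nonzero per column
cols = np.arange(n)
Pi[rng.integers(0, r, size=n), cols] = rng.standard_cauchy(n)

ratios = []
for _ in range(200):                       # random directions x in R^d
    x = rng.standard_normal(d)
    ratios.append(np.linalg.norm(Pi @ A @ x, 1) / np.linalg.norm(A @ x, 1))
print(min(ratios), max(ratios))            # empirical lower / upper l1 distortion
```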
-
Given a matrix $$A \in \mathbb{R}^{n \times d}$$ and a vector $$b \in \mathbb{R}^n$$, we consider the regression problem with $$\ell_\infty$$ guarantees: finding a vector $$x' \in \mathbb{R}^d$$ such that $$\|x'-x^*\|_\infty \leq \frac{\epsilon}{\sqrt{d}} \cdot \|Ax^*-b\|_2 \cdot \|A^\dagger\|$$, where $$x^* = \arg\min_{x\in\mathbb{R}^d} \|Ax - b\|_2$$. One popular approach for solving such $$\ell_2$$ regression problems is via sketching: pick a structured random matrix $$S \in \mathbb{R}^{m \times n}$$ with $$m < n$$ for which $$SA$$ can be quickly computed, and solve the "sketched" regression problem $$\arg\min_{x\in\mathbb{R}^d} \|SAx - Sb\|_2$$. In this paper, we show that in order to obtain such an $$\ell_\infty$$ guarantee for $$\ell_2$$ regression, one has to use sketching matrices that are dense. To the best of our knowledge, this is the first use case in which dense sketching matrices are necessary. On the algorithmic side, we prove that there exists a distribution of dense sketching matrices with $$m = \epsilon^{-2} d \log^3(n/\delta)$$ such that solving the sketched regression problem gives the $$\ell_\infty$$ guarantee with probability at least $$1 - \delta$$. Moreover, the matrix $$SA$$ can be computed in time $$O(nd \log n)$$. Our row count is nearly optimal up to logarithmic factors, and significantly improves the result in (Price et al., 2017), which requires a number of rows superlinear in $$d$$, namely $$m = \Omega(\epsilon^{-2} d^{1+\gamma})$$ for $$\gamma \in (0, 1)$$. Moreover, we develop a novel analytical framework for $$\ell_\infty$$ guarantee regression that utilizes the Oblivious Coordinate-wise Embedding (OCE) property introduced in (Song & Yu, 2021). Our analysis is much simpler and more general than that of (Price et al., 2017). Leveraging this framework, we extend the $$\ell_\infty$$ guarantee regression result to dense sketching matrices for computing the fast tensor product of vectors.
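A small numerical illustration of the setup (not the paper's construction): solve the least-squares problem exactly, solve it again after applying a dense Gaussian sketch $$S$$, and compare the coordinate-wise ($$\ell_\infty$$) error between the two solutions. Dimensions, noise level, and the choice of a plain Gaussian sketch are all hypothetical.

```python
# Illustrative comparison (hypothetical dimensions): exact least squares vs. a
# dense-Gaussian-sketched solve, measured in the coordinate-wise (l_inf) norm.
import numpy as np

rng = np.random.default_rng(3)
n, d, m = 4000, 10, 400
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)   # planted solution + noise

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)        # x* = argmin ||Ax - b||_2

S = rng.standard_normal((m, n)) / np.sqrt(m)          # dense Gaussian sketching matrix
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)  # sketched regression solution

print(np.max(np.abs(x_sk - x_star)))                  # l_inf error ||x' - x*||_inf
print(np.linalg.norm(A @ x_star - b))                 # residual norm ||Ax* - b||_2
```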
-
Can linear systems be solved faster than matrix multiplication? While there has been remarkable progress for the special cases of graph structured linear systems, in the general setting, the bit complexity of solving an $$n \times n$$ linear system $$Ax=b$$ is $$\tilde{O}(n^\omega)$$, where $$\omega < 2.372864$$ is the matrix multiplication exponent. Improving on this has been an open problem even for sparse linear systems with $$\mathrm{poly}(n)$$ condition number. In this paper, we present an algorithm that solves linear systems with sparse matrices asymptotically faster than matrix multiplication for any $$\omega > 2$$. This speedup holds for any input matrix $$A$$ with $$o(n^{\omega -1}/\log(\kappa(A)))$$ non-zeros, where $$\kappa(A)$$ is the condition number of $$A$$. For $$\mathrm{poly}(n)$$-conditioned matrices with $$\tilde{O}(n)$$ nonzeros, and the current value of $$\omega$$, the bit complexity of our algorithm to solve to within any $$1/\mathrm{poly}(n)$$ error is $$O(n^{2.331645})$$. Our algorithm can be viewed as an efficient, randomized implementation of the block Krylov method via recursive low displacement rank factorizations. It is inspired by the algorithm of [Eberly et al. ISSAC '06 '07] for inverting matrices over finite fields. In our analysis of numerical stability, we develop matrix anti-concentration techniques to bound the smallest eigenvalue and the smallest gap in eigenvalues of semi-random matrices.
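For intuition about the Krylov-subspace viewpoint, here is a toy, non-block Krylov solver: build an orthonormal basis of $$\mathrm{span}\{b, Ab, A^2b, \dots\}$$ and solve the projected system. This is only a conceptual illustration; the paper's algorithm is a randomized block Krylov method with recursive low displacement rank factorizations and a careful bit-complexity analysis.

```python
# Toy (non-block) Krylov solver: Galerkin projection onto span{b, Ab, ..., A^(k-1) b}.
# Illustration only; not the paper's randomized block Krylov algorithm.
import numpy as np

def krylov_solve(A, b, k):
    """Approximately solve Ax = b by projecting onto a k-dimensional Krylov subspace."""
    V = [b / np.linalg.norm(b)]
    for _ in range(k - 1):
        w = A @ V[-1]
        for v in V:                          # Gram-Schmidt against the current basis
            w = w - (v @ w) * v
        V.append(w / np.linalg.norm(w))
    Q = np.column_stack(V)                   # orthonormal basis of the Krylov subspace
    y = np.linalg.solve(Q.T @ A @ Q, Q.T @ b)
    return Q @ y

rng = np.random.default_rng(4)
n = 50
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                  # well-conditioned SPD test matrix (hypothetical)
b = rng.standard_normal(n)
x = krylov_solve(A, b, k=25)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))   # relative residual
```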