Title: Structured Semidefinite Programming for Recovering Structured Preconditioners
We develop a general framework for finding approximately-optimal preconditioners for solving linear systems. Leveraging this framework we obtain improved runtimes for fundamental preconditioning and linear system solving problems, including the following.
• Diagonal preconditioning. We give an algorithm which, given positive definite $$\mathbf{K} \in \mathbb{R}^{d \times d}$$ with $$\mathrm{nnz}(\mathbf{K})$$ nonzero entries, computes an $$\epsilon$$-optimal diagonal preconditioner in time $$\widetilde{O}(\mathrm{nnz}(\mathbf{K}) \cdot \mathrm{poly}(\kappa^\star,\epsilon^{-1}))$$, where $$\kappa^\star$$ is the optimal condition number of the rescaled matrix.
• Structured linear systems. We give an algorithm which, given $$\mathbf{M} \in \mathbb{R}^{d \times d}$$ that is either the pseudoinverse of a graph Laplacian matrix or a constant spectral approximation of one, solves linear systems in $$\mathbf{M}$$ in $$\widetilde{O}(d^2)$$ time.
Our diagonal preconditioning results improve on state-of-the-art runtimes of $$\Omega(d^{3.5})$$ attained by general-purpose semidefinite programming, and our solvers improve on state-of-the-art runtimes of $$\Omega(d^{\omega})$$, where $$\omega > 2.3$$ is the current matrix multiplication exponent. We attain our results via new algorithms for a class of semidefinite programs (SDPs) we call matrix-dictionary approximation SDPs, which we leverage to solve an associated problem we call matrix-dictionary recovery.
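As a concrete illustration of the diagonal preconditioning objective (and only of the objective: the sketch below is a naive NumPy baseline using classical Jacobi scaling, not the SDP-based algorithm from the paper), one can compare the condition number of a badly scaled positive definite $$\mathbf{K}$$ with that of the symmetrically rescaled matrix $$\mathbf{D}^{-1/2} \mathbf{K} \mathbf{D}^{-1/2}$$ for $$\mathbf{D} = \mathrm{diag}(\mathbf{K})$$. All names and parameter choices here are ours.

```python
import numpy as np

def condition_number(M):
    """Ratio of largest to smallest eigenvalue of a symmetric positive definite matrix."""
    eigvals = np.linalg.eigvalsh(M)          # ascending order
    return eigvals[-1] / eigvals[0]

def jacobi_rescale(K):
    """Symmetric Jacobi rescaling D^{-1/2} K D^{-1/2} with D = diag(K)."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

rng = np.random.default_rng(0)
# Build a positive definite matrix with badly scaled rows/columns.
A = rng.standard_normal((200, 200))
K = A @ A.T + 200 * np.eye(200)
scales = 10.0 ** rng.uniform(-3, 3, size=200)
K = K * np.outer(scales, scales)             # D_s K D_s with wildly varying D_s

print("kappa(K)                 =", condition_number(K))
print("kappa(Jacobi-rescaled K) =", condition_number(jacobi_rescale(K)))
```

Jacobi scaling is only a heuristic; the point of the result above is that a diagonal rescaling within a $$1+\epsilon$$ factor of the optimal condition number $$\kappa^\star$$ can be computed in time nearly linear in $$\mathrm{nnz}(\mathbf{K})$$, up to $$\mathrm{poly}(\kappa^\star, \epsilon^{-1})$$ factors.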
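The second bullet concerns matrices that behave like graph Laplacian pseudoinverses. As background only (this is not the $$\widetilde{O}(d^2)$$ solver from the paper, and the graph construction is an arbitrary choice for the demo), the sketch below builds the Laplacian of a small connected graph and solves a consistent system $$\mathbf{L}\mathbf{x} = \mathbf{b}$$ with conjugate gradients, the classical iterative primitive that preconditioning is meant to accelerate.

```python
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(3)
d = 300

# Adjacency matrix of a cycle on d vertices (guaranteed connected) plus random chords.
A = np.zeros((d, d))
for i in range(d):
    A[i, (i + 1) % d] = A[(i + 1) % d, i] = 1.0
for _ in range(3 * d):
    i, j = rng.integers(0, d, size=2)
    if i != j:
        A[i, j] = A[j, i] = 1.0

L = np.diag(A.sum(axis=1)) - A   # graph Laplacian

b = rng.standard_normal(d)
b -= b.mean()                    # make L x = b consistent (b orthogonal to the all-ones vector)

x, info = cg(L, b, maxiter=2000)
print("CG converged:", info == 0, "  residual norm:", np.linalg.norm(L @ x - b))
```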
Award ID(s):
2045590
PAR ID:
10495901
Author(s) / Creator(s):
Publisher / Repository:
Curran Associates, Inc.
Date Published:
Journal Name:
Advances in Neural Information Processing Systems
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the $$\ell_p$$ regression problem, which requires finding $$\mathbf{x}\in\mathbb{R}^{d}$$ that minimizes $$\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p$$ for a matrix $$\mathbf{A}\in\mathbb{R}^{n \times d}$$ and response vector $$\mathbf{b}\in\mathbb{R}^{n}$$. There has been recent interest in developing subsampling methods for this problem that can outperform standard techniques when $$n$$ is very large. However, all known subsampling approaches have run time that depends exponentially on $$p$$, typically $$d^{\mathcal{O}(p)}$$, which can be prohibitively expensive. We improve on this work by showing that for a large class of common structured matrices, such as combinations of low-rank matrices, sparse matrices, and Vandermonde matrices, there are subsampling-based methods for $$\ell_p$$ regression that depend polynomially on $$p$$. For example, we give an algorithm for $$\ell_p$$ regression on Vandermonde matrices that runs in time $$\mathcal{O}(n\log^3 n+(dp^2)^{0.5+\omega}\cdot\text{polylog}\,n)$$, where $$\omega$$ is the exponent of matrix multiplication. The polynomial dependence on $$p$$ crucially allows our algorithms to extend naturally to efficient algorithms for $$\ell_\infty$$ regression, via approximation of $$\ell_\infty$$ by $$\ell_{\mathcal{O}(\log n)}$$. Of practical interest, we also develop a new subsampling algorithm for $$\ell_p$$ regression for arbitrary matrices, which is simpler than previous approaches for $$p \ge 4$$. (A direct dense baseline for this problem is sketched after this list.)
  2. Assume $$\mathsf{ZF} + \mathsf{AD}$$ and that all sets of reals are Suslin. Let $$\Gamma$$ be a pointclass that is closed under $$\wedge$$, $$\vee$$, $$\forall^{\mathbb{R}}$$, and continuous substitution, and that has the scale property. Let $$\kappa = \delta(\Gamma)$$ be the supremum of the lengths of prewellorderings on $$\mathbb{R}$$ which belong to $$\Delta = \Gamma \cap \check{\Gamma}$$. Let $$\mathsf{club}$$ denote the collection of club subsets of $$\kappa$$. Then the countable length everywhere club uniformization holds for $$\kappa$$: for every relation $$R \subseteq {}^{<\omega_1}\kappa \times \mathsf{club}$$ with the property that for all $$\ell \in {}^{<\omega_1}\kappa$$ and clubs $$C \subseteq D \subseteq \kappa$$, $$R(\ell, D)$$ implies $$R(\ell, C)$$, there is a uniformization function $$\Lambda : \mathrm{dom}(R) \rightarrow \mathsf{club}$$ with the property that for all $$\ell \in \mathrm{dom}(R)$$, $$R(\ell, \Lambda(\ell))$$. In particular, under these assumptions, for all $$n \in \omega$$, $$\boldsymbol{\delta}^1_{2n+1}$$ satisfies the countable length everywhere club uniformization.
  3. We study fast algorithms for computing basic properties of an $$n \times n$$ positive semidefinite kernel matrix $$\mathbf{K}$$ corresponding to $$n$$ points $$x_1, \ldots, x_n \in \mathbb{R}^d$$. In particular, we consider estimating the sum of kernel matrix entries, along with its top eigenvalue and eigenvector. These are some of the most basic problems defined over kernel matrices. We show that the sum of matrix entries can be estimated up to a multiplicative factor of $$1+\epsilon$$ in time sublinear in $$n$$ and linear in $$d$$ for many popular kernel functions, including the Gaussian, exponential, and rational quadratic kernels. For these kernels, we also show that the top eigenvalue (and a witnessing approximate eigenvector) can be approximated to a multiplicative factor of $$1+\epsilon$$ in time sub-quadratic in $$n$$ and linear in $$d$$. Our algorithms represent significant improvements over the best known runtimes for these problems. They leverage the positive definiteness of the kernel matrix, along with a recent line of work on efficient kernel density estimation. (A brute-force illustration of both quantities appears in a sketch after this list.)
  4. This paper will study almost everywhere behaviors of functions on partition spaces of cardinals possessing suitable partition properties. Almost everywhere continuity and monotonicity properties for functions on partition spaces will be established. These results will be applied to distinguish the cardinality of certain subsets of the power set of partition cardinals. The following summarizes the main results proved under suitable partition hypotheses.
• If $$\kappa$$ is a cardinal, $$\epsilon < \kappa$$, $$\mathrm{cof}(\epsilon) = \omega$$, $$\kappa \rightarrow_* (\kappa)^{\epsilon \cdot \epsilon}_2$$ and $$\Phi : [\kappa]^\epsilon_* \rightarrow \mathrm{ON}$$, then $$\Phi$$ satisfies the almost everywhere short length continuity property: There is a club $$C \subseteq \kappa$$ and a $$\delta < \epsilon$$ so that for all $$f,g \in [C]^\epsilon_*$$, if $$f \upharpoonright \delta = g \upharpoonright \delta$$ and $$\sup(f) = \sup(g)$$, then $$\Phi(f) = \Phi(g)$$.
• If $$\kappa$$ is a cardinal, $$\epsilon$$ is countable, $$\kappa \rightarrow_* (\kappa)^{\epsilon \cdot \epsilon}_2$$ holds and $$\Phi : [\kappa]^\epsilon_* \rightarrow \mathrm{ON}$$, then $$\Phi$$ satisfies the strong almost everywhere short length continuity property: There is a club $$C \subseteq \kappa$$ and finitely many ordinals $$\delta_0, ..., \delta_k \leq \epsilon$$ so that for all $$f,g \in [C]^\epsilon_*$$, if for all $$0 \leq i \leq k$$, $$\sup(f \upharpoonright \delta_i) = \sup(g \upharpoonright \delta_i)$$, then $$\Phi(f) = \Phi(g)$$.
• If $$\kappa$$ satisfies $$\kappa \rightarrow_* (\kappa)^\kappa_2$$, $$\epsilon \leq \kappa$$ and $$\Phi : [\kappa]^\epsilon_* \rightarrow \mathrm{ON}$$, then $$\Phi$$ satisfies the almost everywhere monotonicity property: There is a club $$C \subseteq \kappa$$ so that for all $$f,g \in [C]^\epsilon_*$$, if for all $$\alpha < \epsilon$$, $$f(\alpha) \leq g(\alpha)$$, then $$\Phi(f) \leq \Phi(g)$$.
• Suppose dependent choice ($$\mathsf{DC}$$), $$\omega_1 \rightarrow_* (\omega_1)^{\omega_1}_2$$ and the almost everywhere short length club uniformization principle for $$\omega_1$$ hold. Then every function $$\Phi : [\omega_1]^{\omega_1}_* \rightarrow \omega_1$$ satisfies a finite continuity property with respect to closure points: Let $$\mathfrak{C}_f$$ be the club of $$\alpha < \omega_1$$ so that $$\sup(f \upharpoonright \alpha) = \alpha$$. There is a club $$C \subseteq \omega_1$$ and finitely many functions $$\Upsilon_0, ..., \Upsilon_{n-1} : [C]^{\omega_1}_* \rightarrow \omega_1$$ so that for all $$f \in [C]^{\omega_1}_*$$ and all $$g \in [C]^{\omega_1}_*$$, if $$\mathfrak{C}_g = \mathfrak{C}_f$$ and for all $$i < n$$, $$\sup(g \upharpoonright \Upsilon_i(f)) = \sup(f \upharpoonright \Upsilon_i(f))$$, then $$\Phi(g) = \Phi(f)$$.
• Suppose $$\kappa$$ satisfies $$\kappa \rightarrow_* (\kappa)^\epsilon_2$$ for all $$\epsilon < \kappa$$. For all $$\chi < \kappa$$, $$[\kappa]^{<\kappa}$$ does not inject into $${}^\chi \mathrm{ON}$$, the class of $$\chi$$-length sequences of ordinals, and therefore $$|[\kappa]^\chi| < |[\kappa]^{<\kappa}|$$. As a consequence, under the axiom of determinacy ($$\mathsf{AD}$$), these two cardinality results hold when $$\kappa$$ is one of the following weak or strong partition cardinals of determinacy: $$\omega_1$$, $$\omega_2$$, $$\boldsymbol{\delta}_n^1$$ (for all $$1 \leq n < \omega$$) and $$\boldsymbol{\delta}^2_1$$ (assuming in addition $$\mathsf{DC}_{\mathbb{R}}$$).
  5. We design fast algorithms for repeatedly sampling from strongly Rayleigh distributions, which include as special cases random spanning tree distributions and determinantal point processes. For a graph $$G=(V, E)$$, we show how to approximately sample uniformly random spanning trees from $$G$$ in $$\widetilde{O}(\lvert V\rvert)$$ time per sample (throughout, $$\widetilde{O}(\cdot)$$ hides polylogarithmic factors in $$n$$) after an initial $$\widetilde{O}(\lvert E\rvert)$$ time preprocessing. This is the first runtime that is nearly linear in the output size, which is clearly optimal. For a determinantal point process on $$k$$-sized subsets of a ground set of $$n$$ elements, defined via an $$n\times n$$ kernel matrix, we show how to approximately sample in $$\widetilde{O}(k^\omega)$$ time after an initial $$\widetilde{O}(nk^{\omega-1})$$ time preprocessing, where $$\omega<2.372864$$ is the matrix multiplication exponent. The time to compute just the weight of the output set is already $$\simeq k^\omega$$, a natural barrier that suggests our runtime might be optimal for determinantal point processes as well. As a corollary, we even improve the state of the art for obtaining a single sample from a determinantal point process, from the prior runtime of $$\widetilde{O}(\min\{nk^2, n^\omega\})$$ to $$\widetilde{O}(nk^{\omega-1})$$. In our main technical result, we achieve the optimal limit on domain sparsification for strongly Rayleigh distributions. In domain sparsification, sampling from a distribution $$\mu$$ on $$\binom{[n]}{k}$$ is reduced to sampling from related distributions on $$\binom{[t]}{k}$$ for $$t\ll n$$. We show that for strongly Rayleigh distributions, the domain size can be reduced to nearly linear in the output size, $$t=\widetilde{O}(k)$$, improving the state of the art from $$t= \widetilde{O}(k^2)$$ for general strongly Rayleigh distributions and the more specialized $$t=\widetilde{O}(k^{1.5})$$ for spanning tree distributions. Our reduction involves sampling from $$\widetilde{O}(1)$$ domain-sparsified distributions, all of which can be produced efficiently assuming approximate overestimates for the marginals of $$\mu$$ are known and stored in a convenient data structure. Having access to marginals is the discrete analog of having access to the mean and covariance of a continuous distribution, or equivalently knowing "isotropy" for the distribution, the key behind optimal samplers in the continuous setting based on the famous Kannan-Lovász-Simonovits (KLS) conjecture. We view our result as analogous in spirit to the KLS conjecture and its consequences for sampling, but for discrete strongly Rayleigh measures. (Wilson's classical sampler for the uniform spanning tree special case is sketched after this list.)
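The sketches below illustrate three of the related records above. First, for item 1, a direct dense baseline for $$\ell_p$$ regression: it minimizes $$\sum_i |(\mathbf{A}\mathbf{x}-\mathbf{b})_i|^p$$ numerically with SciPy, which has the same minimizer as $$\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p$$. This is only the naive approach that subsampling methods aim to outperform for very large $$n$$; the function name lp_regression and all parameter choices are ours.

```python
import numpy as np
from scipy.optimize import minimize

def lp_regression(A, b, p):
    """Minimize sum_i |(A x - b)_i|^p directly (dense baseline, convex for p >= 1)."""
    def objective(x):
        return np.sum(np.abs(A @ x - b) ** p)
    x0 = np.linalg.lstsq(A, b, rcond=None)[0]      # least-squares warm start
    result = minimize(objective, x0, method="L-BFGS-B")
    return result.x

rng = np.random.default_rng(1)
n, d, p = 2000, 10, 4
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
x_hat = lp_regression(A, b, p)
print("l_p residual norm:", np.linalg.norm(A @ x_hat - b, ord=p))
```

A subsampling method would instead solve a reweighted instance on a small subset of the rows of $$\mathbf{A}$$, which is where the structured-matrix results described in item 1 apply.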
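Next, a sketch for item 3: brute-force estimators for the two quantities discussed there, written in plain NumPy. Uniform sampling of index pairs gives an unbiased (though not variance-optimal) estimate of the kernel sum, and power iteration approximates the top eigenvalue. Neither is the sublinear-time algorithm from that abstract, and the Gaussian kernel with unit bandwidth is an arbitrary choice for the demo.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * bandwidth ** 2))

def estimate_kernel_sum(X, num_samples, rng):
    """Estimate sum_{i,j} k(x_i, x_j) by uniformly sampling index pairs (unbiased)."""
    n = X.shape[0]
    rows = rng.integers(0, n, size=num_samples)
    cols = rng.integers(0, n, size=num_samples)
    vals = [gaussian_kernel(X[i], X[j]) for i, j in zip(rows, cols)]
    return n * n * np.mean(vals)

def top_eigenvalue(K, iters=100, rng=None):
    """Approximate the largest eigenvalue of a PSD matrix by power iteration."""
    if rng is None:
        rng = np.random.default_rng()
    v = rng.standard_normal(K.shape[0])
    for _ in range(iters):
        v = K @ v
        v /= np.linalg.norm(v)
    return v @ K @ v                      # Rayleigh quotient of the unit vector v

rng = np.random.default_rng(2)
X = rng.standard_normal((500, 5))
K = np.exp(-np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1) / 2)

print("sampled kernel sum estimate:   ", estimate_kernel_sum(X, 5000, rng))
print("exact kernel sum:              ", K.sum())
print("power-iteration top eigenvalue:", top_eigenvalue(K, rng=rng))
print("exact top eigenvalue:          ", np.linalg.eigvalsh(K)[-1])
```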
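Finally, a sketch for item 5: Wilson's classical loop-erased random walk algorithm for sampling a uniformly random spanning tree, the most concrete special case of a strongly Rayleigh distribution mentioned there. This is the textbook exact sampler, not the fast repeated-sampling machinery from that abstract, and the helper names (wilson_random_spanning_tree, grid_adjacency) are ours.

```python
import random

def wilson_random_spanning_tree(adj, seed=None):
    """Sample a uniformly random spanning tree of a connected undirected graph
    via Wilson's loop-erased random walk algorithm. `adj` maps each vertex to
    a list of its neighbors."""
    rng = random.Random(seed)
    vertices = list(adj)
    root = vertices[0]
    in_tree = {root}
    next_vertex = {}                 # successor pointers defining the tree edges
    for start in vertices:
        # Random walk from `start` until the current tree is hit, recording only
        # the last exit taken from each visited vertex (this erases loops).
        u = start
        while u not in in_tree:
            next_vertex[u] = rng.choice(adj[u])
            u = next_vertex[u]
        # Add the loop-erased path from `start` to the tree.
        u = start
        while u not in in_tree:
            in_tree.add(u)
            u = next_vertex[u]
    # Each non-root vertex contributes exactly one tree edge (u, next_vertex[u]).
    return set(next_vertex.items())

def grid_adjacency(rows, cols):
    """Adjacency lists of a rows x cols grid graph."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = []
            if r > 0: nbrs.append((r - 1, c))
            if r < rows - 1: nbrs.append((r + 1, c))
            if c > 0: nbrs.append((r, c - 1))
            if c < cols - 1: nbrs.append((r, c + 1))
            adj[(r, c)] = nbrs
    return adj

tree = wilson_random_spanning_tree(grid_adjacency(3, 3), seed=0)
print(len(tree), "tree edges:", sorted(tree))
```

Wilson's algorithm produces one exact uniform sample per run; the abstract above is concerned with amortizing the cost of many such samples to nearly linear in the number of vertices after preprocessing.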