Search for: All records

Award ID contains: 2045590

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (an administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. We consider the problem of estimating the spectral density of the normalized adjacency matrix of an $$n$$-node undirected graph. We provide a randomized algorithm that, with $$O(n\epsilon^{-2})$$ queries to a degree and neighbor oracle and in $$O(n\epsilon^{-3})$$ time, estimates the spectrum up to $$\epsilon$$ accuracy in the Wasserstein-1 metric. This improves on previous state-of-the-art methods, including an $$O(n\epsilon^{-7})$$ time algorithm from [Braverman et al., STOC 2022] and, for sufficiently small $$\epsilon$$, a $$2^{O(\epsilon^{-1})}$$ time method from [Cohen-Steiner et al., KDD 2018]. To achieve this result, we introduce a new notion of graph sparsification, which we call "nuclear sparsification". We provide an $$O(n\epsilon^{-2})$$-query and $$O(n\epsilon^{-2})$$-time algorithm for computing $$O(n\epsilon^{-2})$$-sparse nuclear sparsifiers. We show that this bound is optimal in both its sparsity and query complexity, and we separate our results from the related notion of additive spectral sparsification. Of independent interest, we show that our sparsification method also yields the first deterministic algorithm for spectral density estimation that scales linearly with $$n$$ (sublinear in the representation size of the graph). A minimal illustrative sketch of this problem setup follows this record.
    Free, publicly-accessible full text available June 30, 2026
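
To fix the objects involved, here is a minimal baseline sketch, not the paper's sublinear-time algorithm: it forms the normalized adjacency matrix of a small random graph explicitly, computes its exact spectrum, and compares two spectral densities in the Wasserstein-1 metric. NumPy and SciPy are assumed; the Erdős–Rényi graphs, sizes, and densities are arbitrary illustrative choices.

```python
# Illustrative baseline only (NOT the paper's sublinear-time algorithm):
# build the normalized adjacency matrix of a small random graph, compute
# its exact spectrum, and compare two spectral densities in Wasserstein-1.
import numpy as np
from scipy.stats import wasserstein_distance

def random_adjacency(n, p, rng):
    # Symmetric 0/1 Erdos-Renyi adjacency matrix with no self-loops.
    upper = np.triu(rng.random((n, n)) < p, 1).astype(float)
    return upper + upper.T

def normalized_adjacency_spectrum(A):
    # D^{-1/2} A D^{-1/2}; isolated vertices get a zero row/column.
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    return np.linalg.eigvalsh(d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :])

rng = np.random.default_rng(0)
s1 = normalized_adjacency_spectrum(random_adjacency(300, 0.05, rng))
s2 = normalized_adjacency_spectrum(random_adjacency(300, 0.08, rng))
# Accuracy guarantees in the abstract are stated in exactly this metric.
print(wasserstein_distance(s1, s2))
```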
  2. Free, publicly-accessible full text available June 30, 2026
  3. Free, publicly-accessible full text available April 24, 2026
  4. Free, publicly-accessible full text available March 31, 2026
  5. Free, publicly-accessible full text available February 1, 2026
  6. We present a new class of preconditioned iterative methods for solving linear systems of the form $$Ax = b$$. Our methods are based on constructing a low-rank Nyström approximation to $$A$$ using sparse random matrix sketching. This approximation is used to construct a preconditioner, which itself is inverted quickly using additional levels of random sketching and preconditioning. We prove that the convergence of our methods depends on a natural average condition number of $$A$$, which improves as the rank of the Nyström approximation increases. Concretely, this allows us to obtain faster runtimes for a number of fundamental linear algebraic problems: 1. We show how to solve any $$n \times n$$ linear system that is well-conditioned except for $$k$$ outlying large singular values in $$\tilde{O}(n^{2.065} + k^{\omega})$$ time, improving on a recent result of [Dereziński, Yang, STOC 2024] for all $$k \gtrsim n^{0.78}$$. 2. We give the first $$\tilde{O}(n^2 + d_\lambda^{\omega})$$ time algorithm for solving a regularized linear system $$(A + \lambda I)x = b$$, where $$A$$ is positive semidefinite with effective dimension $$d_\lambda = \operatorname{tr}(A(A + \lambda I)^{-1})$$. This problem arises in applications like Gaussian process regression. 3. We give faster algorithms for approximating Schatten $$p$$-norms and other matrix norms. For example, for the Schatten 1-norm (nuclear norm), we give an algorithm that runs in $$\tilde{O}(n^{2.11})$$ time, improving on an $$\tilde{O}(n^{2.18})$$ method of [Musco et al., ITCS 2018]. All results are proven in the real RAM model of computation. Interestingly, previous state-of-the-art algorithms for most of the problems above relied on stochastic iterative methods, like stochastic coordinate and gradient descent. Our work takes a completely different approach, instead leveraging tools from matrix sketching. A minimal sketch of Nyström preconditioning follows this record.
    Free, publicly-accessible full text available January 12, 2026
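
Below is a minimal sketch of the core idea, Nyström-based preconditioning for a regularized system $$(A + \lambda I)x = b$$, under simplifying assumptions: a dense Gaussian sketch replaces the paper's sparse sketches, and SciPy's stock conjugate gradient replaces the recursive sketch-and-precondition inversion. The helper names (nystrom_eigs, apply_precond) and all parameter choices are illustrative.

```python
# A minimal sketch of Nystrom-preconditioned CG for (A + lam*I)x = b.
# Dense Gaussian sketching and SciPy's CG stand in for the paper's sparse
# sketches and recursive inversion; all names here are illustrative.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def nystrom_eigs(A, k, rng):
    # Rank-k Nystrom approximation of PSD A: returns (U, s) with
    # A ~ U @ diag(s) @ U.T.
    Omega = rng.standard_normal((A.shape[0], k))
    Y = A @ Omega
    C = Omega.T @ Y                          # k x k core matrix
    w, V = np.linalg.eigh(C)
    w = np.clip(w, 0.0, None)
    inv_sqrt = np.zeros_like(w)
    mask = w > 1e-10
    inv_sqrt[mask] = 1.0 / np.sqrt(w[mask])
    Z = Y @ (V * inv_sqrt) @ V.T             # Z @ Z.T = Y @ pinv(C) @ Y.T
    U, sv, _ = np.linalg.svd(Z, full_matrices=False)
    return U, sv**2

rng = np.random.default_rng(0)
n, k, lam = 500, 40, 1e-2
G = rng.standard_normal((n, n))
A = G @ G.T / n                              # PSD test matrix
b = rng.standard_normal(n)

U, s = nystrom_eigs(A, k, rng)
def apply_precond(r):
    # P^{-1} = U diag(1/(s + lam)) U^T + (1/lam)(I - U U^T)
    c = U.T @ r
    return U @ (c / (s + lam)) + (r - U @ c) / lam

Aop = LinearOperator((n, n), matvec=lambda v: A @ v + lam * v, dtype=float)
M = LinearOperator((n, n), matvec=apply_precond, dtype=float)
x, info = cg(Aop, b, M=M)
print(info, np.linalg.norm(A @ x + lam * x - b))
```

The preconditioner inverts the approximation on the captured top-$$k$$ eigenspace and scales the orthogonal complement by $$1/\lambda$$, so CG's convergence is governed by the part of the spectrum the Nyström approximation misses.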
  7. We describe a randomized algorithm for producing a near-optimal hierarchical off-diagonal low-rank (HODLR) approximation to an $$n \times n$$ matrix $$A$$, accessible only through matrix-vector products with $$A$$ and $$A^T$$. We prove that, for the rank-$$k$$ HODLR approximation problem, our method achieves a $$(1+\beta)\log(n)$$-optimal approximation in expected Frobenius norm using $$O(k\log(n)/\beta^3)$$ matrix-vector products. In particular, the algorithm obtains a $$(1+\epsilon)$$-optimal approximation with $$O(k\log^4(n)/\epsilon^3)$$ matrix-vector products, and for any constant $$c$$, an $$n^c$$-optimal approximation with $$O(k\log(n))$$ matrix-vector products. Apart from matrix-vector products, the additional computational cost of our method is just $$O(n\,\operatorname{poly}(\log(n), k, \beta))$$. We complement the upper bound with a lower bound, which shows that any matrix-vector query algorithm requires at least $$\Omega(k\log(n) + k/\epsilon)$$ queries to obtain a $$(1+\epsilon)$$-optimal approximation. Our algorithm can be viewed as a robust version of widely used “peeling” methods for recovering HODLR matrices and is, to the best of our knowledge, the first matrix-vector query algorithm to enjoy theoretical worst-case guarantees for approximation by any hierarchical matrix class. To control the propagation of error between levels of hierarchical approximation, we introduce a new perturbation bound for low-rank approximation, which shows that the widely used Generalized Nyström method enjoys inherent stability when implemented with noisy matrix-vector products. We also introduce a novel randomly perforated matrix sketching method to further control the error in the peeling algorithm. A one-level peeling sketch follows this record.
    Free, publicly-accessible full text available January 12, 2026
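
As a complement, here is a single level of "peeling" under strong simplifying assumptions: a fixed 2×2 block partition, off-diagonal blocks of exact rank $$k$$, and noiseless matrix-vector products. The full hierarchical algorithm recurses on the diagonal blocks, subtracts recovered terms, and must control error propagation (the focus of the paper's analysis), all of which this toy omits. NumPy is assumed and all names are illustrative.

```python
# One level of "peeling": probes that are zero outside one column block
# isolate the opposite off-diagonal block, which is then recovered by a
# randomized low-rank approximation using matvecs with A and A.T only.
import numpy as np

rng = np.random.default_rng(0)
n, k, p = 256, 5, 9                 # size, block rank, probes (k + oversampling)
h = n // 2

# Test matrix: dense diagonal blocks, rank-k off-diagonal blocks.
A = np.zeros((n, n))
A[:h, :h] = rng.standard_normal((h, h))
A[h:, h:] = rng.standard_normal((h, h))
A[:h, h:] = rng.standard_normal((h, k)) @ rng.standard_normal((k, h))
A[h:, :h] = rng.standard_normal((h, k)) @ rng.standard_normal((k, h))

matvec = lambda X: A @ X            # the only access the algorithm is allowed
rmatvec = lambda X: A.T @ X

def peel_block(rows, cols):
    # Zeroing the probe outside `cols` makes the same-row diagonal block
    # contribute nothing, isolating the off-diagonal block A[rows, cols].
    Omega = np.zeros((n, p))
    Omega[cols] = rng.standard_normal((h, p))
    Q, _ = np.linalg.qr(matvec(Omega)[rows])     # basis for the block's range
    Psi = np.zeros((n, p))
    Psi[rows] = Q
    Z = rmatvec(Psi)[cols]                       # Z = A[rows, cols].T @ Q
    return Q @ Z.T

B12 = peel_block(slice(0, h), slice(h, n))
B21 = peel_block(slice(h, n), slice(0, h))
print(np.linalg.norm(B12 - A[:h, h:]), np.linalg.norm(B21 - A[h:, :h]))
```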
  8. We study algorithms for approximating the spectral density (i.e., the eigenvalue distribution) of a symmetric matrix $$A \in \mathbb{R}^{n \times n}$$ that is accessed through matrix-vector product queries. Recent work has analyzed popular Krylov subspace methods for this problem, showing that they output an $$\epsilon \cdot \|A\|_2$$ error approximation to the spectral density in the Wasserstein-1 metric using $$O(1/\epsilon)$$ matrix-vector products. By combining a previously studied Chebyshev polynomial moment matching method with a deflation step that approximately projects off the largest magnitude eigendirections of $$A$$ before estimating the spectral density, we give an improved error bound of $$\epsilon \cdot \sigma_\ell(A)$$ using $$O(\ell \log n + 1/\epsilon)$$ matrix-vector products, where $$\sigma_\ell(A)$$ is the $$\ell$$th largest singular value of $$A$$. In the common case when $$A$$ exhibits fast singular value decay and so $$\sigma_\ell(A) \ll \|A\|_2$$, our bound can be much stronger than prior work. We also show that it is nearly tight: any algorithm giving error $$\epsilon \cdot \sigma_\ell(A)$$ must use $$\Omega(\ell + 1/\epsilon)$$ matrix-vector products. We further show that the popular Stochastic Lanczos Quadrature (SLQ) method essentially matches the above bound for any choice of parameter $$\ell$$, even though SLQ itself is parameter-free and performs no explicit deflation. Our bound helps to explain the strong practical performance and observed ‘spectrum adaptive’ nature of SLQ, and motivates a simple variant of the method that achieves an even tighter error bound. Technically, our results require a careful analysis of how eigenvalues and eigenvectors are approximated by (block) Krylov subspace methods, which may be of independent interest. Our error bound for SLQ leverages an analysis of the method that views it as an implicit polynomial moment matching method, along with recent results on low-rank approximation with single-vector Krylov methods. We use these results to show that the method can perform ‘implicit deflation’ as part of moment matching. A minimal SLQ sketch follows this record.
    Free, publicly-accessible full text available January 12, 2026
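
The following is a minimal Stochastic Lanczos Quadrature sketch, the baseline method the abstract analyzes rather than the paper's tighter variant: each random start vector yields a Gaussian quadrature rule (Ritz values, weighted by the squared first components of the Ritz vectors), and averaging over start vectors estimates the spectral density. NumPy is assumed; full reorthogonalization and the parameter defaults are simplifications for illustration.

```python
# A minimal Stochastic Lanczos Quadrature (SLQ) sketch. Production codes
# are more careful about reorthogonalization and breakdown handling.
import numpy as np

def lanczos(matvec, n, m, rng):
    Q = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    for j in range(m):
        Q[:, j] = q
        w = matvec(q)
        alpha[j] = q @ w
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)   # full reorthogonalization
        if j == m - 1:
            break
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-10:                        # invariant subspace: truncate
            return alpha[:j + 1], beta[:j]
        q = w / beta[j]
    return alpha, beta

def slq_density(matvec, n, m=40, n_vecs=10, seed=0):
    rng = np.random.default_rng(seed)
    nodes, weights = [], []
    for _ in range(n_vecs):
        a, b = lanczos(matvec, n, m, rng)
        T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
        theta, S = np.linalg.eigh(T)               # Ritz values / vectors
        nodes.append(theta)
        weights.append(S[0] ** 2 / n_vecs)         # quadrature weights
    return np.concatenate(nodes), np.concatenate(weights)

# Usage on a small random symmetric test matrix.
rng = np.random.default_rng(1)
n = 400
B = rng.standard_normal((n, n))
A = (B + B.T) / np.sqrt(2 * n)
nodes, weights = slq_density(lambda v: A @ v, n)
print(nodes.min(), nodes.max(), weights.sum())     # weights sum to ~1.0
```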
  9. Free, publicly-accessible full text available December 31, 2025
  10. Estimating the effect of treatments from natural experiments, where treatments are pre-assigned, is an important and well-studied problem. We introduce a novel natural experiment dataset obtained from an early childhood literacy nonprofit. Surprisingly, applying over 20 established estimators to the dataset produces inconsistent results in evaluating the nonprofit's efficacy. To address this, we create a benchmark to evaluate estimator accuracy using synthetic outcomes, whose design was guided by domain experts. The benchmark extensively explores performance as real-world conditions like sample size, treatment correlation, and propensity score accuracy vary. Based on our benchmark, we observe that the class of doubly robust treatment effect estimators, which are based on simple and intuitive regression adjustment, generally outperforms other, more complicated estimators by orders of magnitude. To better support our theoretical understanding of doubly robust estimators, we derive a closed-form expression for the variance of any such estimator that uses dataset splitting to obtain an unbiased estimate. This expression motivates the design of a new doubly robust estimator that uses a novel loss function when fitting functions for regression adjustment. We release the dataset and benchmark in a Python package; the package is built in a modular way to facilitate new datasets and estimators: https://github.com/rtealwitter/naturalexperiments. A minimal doubly robust estimator sketch follows this record.
    Free, publicly-accessible full text available December 13, 2025
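
To make the "doubly robust estimation with dataset splitting" recipe concrete, here is a minimal cross-fitted AIPW sketch for the average treatment effect. This is the generic textbook construction, not the paper's new estimator or loss function; the scikit-learn models, the propensity clipping threshold, and the synthetic data are all illustrative assumptions.

```python
# A minimal cross-fitted AIPW (doubly robust) sketch for the average
# treatment effect. Generic textbook construction, not the paper's new
# estimator; model choices and clipping threshold are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_ate(X, T, Y, seed=0):
    rng = np.random.default_rng(seed)
    n = len(Y)
    fold = rng.permutation(n) % 2        # two-fold dataset splitting
    psi = np.zeros(n)
    for f in (0, 1):
        tr, te = fold != f, fold == f
        # Nuisance models are always fit on the *other* fold.
        e = LogisticRegression(max_iter=1000).fit(X[tr], T[tr])
        mu0 = LinearRegression().fit(X[tr][T[tr] == 0], Y[tr][T[tr] == 0])
        mu1 = LinearRegression().fit(X[tr][T[tr] == 1], Y[tr][T[tr] == 1])
        p = np.clip(e.predict_proba(X[te])[:, 1], 0.01, 0.99)
        m0, m1 = mu0.predict(X[te]), mu1.predict(X[te])
        # Regression adjustment plus inverse-propensity correction terms.
        psi[te] = (m1 - m0
                   + T[te] * (Y[te] - m1) / p
                   - (1 - T[te]) * (Y[te] - m0) / (1 - p))
    return psi.mean()

# Synthetic check: the true average treatment effect is 2.0.
rng = np.random.default_rng(1)
n = 5000
X = rng.standard_normal((n, 3))
T = (rng.random(n) < 1.0 / (1.0 + np.exp(-X[:, 0]))).astype(int)
Y = X @ np.array([1.0, -0.5, 0.25]) + 2.0 * T + rng.standard_normal(n)
print(aipw_ate(X, T, Y))                 # should print a value near 2.0
```

The estimate stays consistent if either the propensity model or the outcome regressions are correct, which is the "doubly robust" property the benchmark's best-performing estimators share.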