Abstract. We describe a Lanczos-based algorithm for approximating the product of a rational matrix function with a vector. This algorithm, which we call the Lanczos method for optimal rational matrix function approximation (Lanczos-OR), returns the optimal approximation from a given Krylov subspace in a norm depending on the rational function’s denominator, and it can be computed using the information from a slightly larger Krylov subspace. We also provide a low-memory implementation which only requires storing a number of vectors proportional to the denominator degree of the rational function. Finally, we show that Lanczos-OR can be used to derive algorithms for computing other matrix functions, including the matrix sign function and quadrature-based rational function approximations. In many cases, it improves on the approximation quality of prior approaches, including the standard Lanczos method, with little additional computational overhead. 
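As background, the quantity all of these methods approximate is r(A)b for a rational function r = p/q. Below is a minimal reference sketch, assuming a partial fraction form r(x) = w_inf + Σ_i w_i/(x − z_i), that evaluates r(A)b directly with one dense shifted solve per pole. It is not the Lanczos-OR algorithm itself; the matrix, poles, and weights are illustrative placeholders.

```python
import numpy as np

def rational_apply_direct(A, b, poles, weights, w_inf=0.0):
    """Reference evaluation of r(A) b for r(x) = w_inf + sum_i weights[i] / (x - poles[i]),
    using one dense shifted solve per pole.  O(n^3) per pole; for small sanity checks only."""
    n = A.shape[0]
    out = w_inf * b
    for z, w in zip(poles, weights):
        out = out + w * np.linalg.solve(A - z * np.eye(n), b)
    return out

# Tiny example: r(x) = 1/((x+1)(x+2)) = 1/(x+1) - 1/(x+2)  (partial fractions)
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + np.eye(50)          # symmetric positive definite test matrix
b = rng.standard_normal(50)
approx = rational_apply_direct(A, b, poles=[-1.0, -2.0], weights=[1.0, -1.0])
```

Krylov methods such as Lanczos-OR aim to approximate this same quantity using only matrix-vector products with A, avoiding the dense shifted solves above.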
Near-Optimal Approximation of Matrix Functions by the Lanczos Method
Approximating the action of a matrix function $$f(\vec{A})$$ on a vector $$\vec{b}$$ is an increasingly important primitive in machine learning, data science, and statistics, with applications such as sampling high dimensional Gaussians, Gaussian process regression and Bayesian inference, principal component analysis, and approximating Hessian spectral densities. Over the past decade, a number of algorithms enjoying strong theoretical guarantees have been proposed for this task. Many of the most successful belong to a family of algorithms called \emph{Krylov subspace methods}. Remarkably, a classic Krylov subspace method, called the Lanczos method for matrix functions (Lanczos-FA), frequently outperforms newer methods in practice. Our main result is a theoretical justification for this finding: we show that, for a natural class of \emph{rational functions}, Lanczos-FA matches the error of the best possible Krylov subspace method up to a multiplicative approximation factor. The approximation factor depends on the degree of $$f(x)$$'s denominator and the condition number of $$\vec{A}$$, but not on the number of iterations $$k$$. Our result provides a strong justification for the excellent performance of Lanczos-FA, especially on functions that are well approximated by rationals, such as the matrix square root.
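To make the Lanczos-FA iterate concrete, here is a minimal sketch, assuming a real symmetric A: run k Lanczos steps to obtain an orthonormal basis Q_k and tridiagonal T_k, then form the approximation ||b|| Q_k f(T_k) e_1. The helper names and the inverse square root test function are illustrative, not taken from the paper.

```python
import numpy as np

def lanczos(A, b, k):
    """k steps of the Lanczos iteration (full reorthogonalization, for clarity).
    Returns an orthonormal basis Q (n x k) and the tridiagonal matrix T (k x k)."""
    n = b.shape[0]
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)   # full reorthogonalization
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return Q, T

def lanczos_fa(A, b, k, f):
    """Lanczos-FA approximation to f(A) b:  ||b|| * Q f(T) e1."""
    Q, T = lanczos(A, b, k)
    evals, evecs = np.linalg.eigh(T)              # f(T) via eigendecomposition of T
    fT_e1 = evecs @ (f(evals) * evecs[0, :])      # f(T) applied to the first unit vector
    return np.linalg.norm(b) * (Q @ fT_e1)

# Example: approximate A^{-1/2} b for an SPD matrix (a function well approximated by rationals).
rng = np.random.default_rng(0)
M = rng.standard_normal((200, 200))
A = M @ M.T + np.eye(200)
b = rng.standard_normal(200)
x_k = lanczos_fa(A, b, k=30, f=lambda x: 1.0 / np.sqrt(x))
```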
- PAR ID: 10577176
- Publisher / Repository: Advances in Neural Information Processing Systems
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- This paper concerns the theory and development of inexact rational Krylov subspace methods for approximating the action of a function of a matrix f(A) on a column vector b. At each step of the rational Krylov subspace methods, a shifted linear system of equations needs to be solved to enlarge the subspace. For large-scale problems, such a linear system is usually solved approximately by an iterative method. The main question is how to relax the accuracy of these linear solves without negatively affecting the convergence of the approximation of f(A)b. Our insight into this issue is obtained by exploring the residual bounds for the rational Krylov subspace approximations of f(A)b, based on the decaying behavior of the entries in the first column of certain matrices of A restricted to the rational Krylov subspaces. The decay bounds for these entries for both analytic functions and Markov functions can be efficiently and accurately evaluated by appropriate quadrature rules. A heuristic based on these bounds is proposed to relax the tolerances of the linear solves arising in each step of the rational Krylov subspace methods. As the algorithm progresses toward convergence, the linear solves can be performed with increasingly lower accuracy and computational cost. Numerical experiments for large nonsymmetric matrices show the effectiveness of the tolerance relaxation strategy for the inexact linear solves of rational Krylov subspace methods. (A minimal sketch of this relaxation idea appears after this list.)
- We study algorithms for approximating the spectral density (i.e., the eigenvalue distribution) of a symmetric matrix A ∈ ℝ^{n×n} that is accessed through matrix-vector product queries. Recent work has analyzed popular Krylov subspace methods for this problem, showing that they output an ε · ||A||_2 error approximation to the spectral density in the Wasserstein-1 metric using O(1/ε) matrix-vector products. By combining a previously studied Chebyshev polynomial moment matching method with a deflation step that approximately projects off the largest magnitude eigendirections of A before estimating the spectral density, we give an improved error bound of ε · σ_ℓ(A) using O(ℓ log n + 1/ε) matrix-vector products, where σ_ℓ(A) is the ℓth largest singular value of A. In the common case when A exhibits fast singular value decay and so σ_ℓ(A) ≪ ||A||_2, our bound can be much stronger than prior work. We also show that it is nearly tight: any algorithm giving error ε · σ_ℓ(A) must use Ω(ℓ + 1/ε) matrix-vector products. We further show that the popular Stochastic Lanczos Quadrature (SLQ) method essentially matches the above bound for any choice of parameter ℓ, even though SLQ itself is parameter-free and performs no explicit deflation. Our bound helps to explain the strong practical performance and observed 'spectrum adaptive' nature of SLQ, and motivates a simple variant of the method that achieves an even tighter error bound. Technically, our results require a careful analysis of how eigenvalues and eigenvectors are approximated by (block) Krylov subspace methods, which may be of independent interest. Our error bound for SLQ leverages an analysis of the method that views it as an implicit polynomial moment matching method, along with recent results on low-rank approximation with single-vector Krylov methods. We use these results to show that the method can perform 'implicit deflation' as part of moment matching. (A minimal SLQ sketch appears after this list.)
- Krylov subspace methods are a ubiquitous tool for computing near-optimal rank k approximations of large matrices. While "large block" Krylov methods with block size at least k give the best known theoretical guarantees, block size one (a single vector) or a small constant is often preferred in practice. Despite their popularity, we lack theoretical bounds on the performance of such "small block" Krylov methods for low-rank approximation. We address this gap between theory and practice by proving that small block Krylov methods essentially match all known low-rank approximation guarantees for large block methods. Via a black-box reduction we show, for example, that the standard single vector Krylov method run for t iterations obtains the same spectral norm and Frobenius norm error bounds as a Krylov method with block size ℓ ≥ k run for O(t/ℓ) iterations, up to a logarithmic dependence on the smallest gap between sequential singular values. That is, for a given number of matrix-vector products, single vector methods are essentially as effective as any choice of large block size. By combining our result with tail-bounds on eigenvalue gaps in random matrices, we prove that the dependence on the smallest singular value gap can be eliminated if the input matrix is perturbed by a small random matrix. Further, we show that single vector methods match the more complex algorithm of [Bakshi et al. '22], which combines the results of multiple block sizes to achieve an improved algorithm for Schatten p-norm low-rank approximation. (A single-vector Krylov low-rank sketch appears after this list.)
- A flexible extended Krylov subspace method (F-EKSM) is considered for numerical approximation of the action of a matrix function f(A) on a vector b, where the function f is of Markov type. F-EKSM has the same framework as the extended Krylov subspace method (EKSM), replacing the zero pole in EKSM with a properly chosen fixed nonzero pole. For symmetric positive definite matrices, the optimal fixed pole is derived for F-EKSM to achieve the lowest possible upper bound on the asymptotic convergence factor, which is lower than that of EKSM. The analysis is based on properties of Faber polynomials of A and (I − A/s)^{-1}. For large and sparse matrices that can be handled efficiently by LU factorizations, numerical experiments show that F-EKSM and a variant of RKSM based on a small number of fixed poles outperform EKSM in both storage and runtime, and usually have advantages over adaptive RKSM in runtime. (A minimal extended Krylov sketch with a fixed nonzero pole appears after this list.)
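For the first item above (inexact rational Krylov), a rough sketch of the underlying idea under simplifying assumptions: each step enlarges the subspace via an approximate shifted solve, and the inner tolerance is relaxed geometrically as the outer iteration proceeds. The specific poles, the SPD test matrix, the relaxation schedule, and the use of conjugate gradients for the inner solves are illustrative choices, not the paper's heuristic.

```python
import numpy as np

def cg(A, b, tol, maxiter=500):
    """Plain conjugate gradients for an SPD system, stopped at relative residual `tol`."""
    x = np.zeros_like(b)
    r = b.copy(); p = r.copy()
    rr = r @ r
    bnorm = np.linalg.norm(b)
    for _ in range(maxiter):
        if np.sqrt(rr) <= tol * bnorm:
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

def inexact_rational_krylov(A, b, poles, f, tol0=1e-10, relax=10.0):
    """Illustrative rational Krylov approximation of f(A) b with *relaxed* inner solves:
    the shifted system at step j is solved only to tolerance tol0 * relax**j."""
    V = [b / np.linalg.norm(b)]
    for j, s in enumerate(poles):
        # approximate solve of (A - s I) w = v_j, with an increasingly loose tolerance
        w = cg(A - s * np.eye(A.shape[0]), V[-1], tol=min(1e-2, tol0 * relax**j))
        for v in V:                       # Gram-Schmidt against the current basis
            w -= (v @ w) * v
        V.append(w / np.linalg.norm(w))
    V = np.column_stack(V)
    H = V.T @ A @ V                       # projection of A onto the rational Krylov space
    evals, evecs = np.linalg.eigh(H)
    return V @ (evecs @ (f(evals) * (evecs.T @ (V.T @ b))))

rng = np.random.default_rng(0)
M = rng.standard_normal((300, 300))
A = M @ M.T + np.eye(300)                 # SPD, so A - sI is SPD for the negative poles below
b = rng.standard_normal(300)
x = inexact_rational_krylov(A, b, poles=[-1, -4, -16, -64, -256], f=lambda t: 1 / np.sqrt(t))
```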
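For the second item (spectral density estimation), a minimal sketch of the parameter-free Stochastic Lanczos Quadrature method it analyzes: each random start vector yields a quadrature rule whose nodes are the Ritz values and whose weights are the squared first components of the tridiagonal matrix's eigenvectors; averaging over vectors gives an approximate spectral density. The helper names and parameter defaults are illustrative.

```python
import numpy as np

def lanczos_tridiag(A, v, k):
    """k Lanczos steps; returns the k x k tridiagonal T (full reorthogonalization for clarity)."""
    n = v.shape[0]
    Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k - 1)
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

def slq_spectral_density(A, k=40, n_vec=10, seed=0):
    """Stochastic Lanczos Quadrature: returns (nodes, weights) whose weighted point masses
    approximate the (normalized) spectral density of symmetric A.  Nodes are eigenvalues of T;
    weights are squared first components of T's eigenvectors, averaged over n_vec trials."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    nodes, weights = [], []
    for _ in range(n_vec):
        v = rng.standard_normal(n)
        T = lanczos_tridiag(A, v, k)
        theta, S = np.linalg.eigh(T)
        nodes.append(theta)
        weights.append(S[0, :] ** 2 / n_vec)
    return np.concatenate(nodes), np.concatenate(weights)
```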
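For the third item (small-block Krylov low-rank approximation), a sketch of the block-size-one method being analyzed: build a single-vector Krylov space with powers of AAᵀ, then take the best rank-k approximation of A projected onto that space. This is a generic single-vector Krylov sketch, not the paper's reduction or analysis.

```python
import numpy as np

def single_vector_krylov_lowrank(A, k, t, seed=0):
    """Rank-k approximation of A from the single-vector Krylov space
    span{g, (A A^T) g, ..., (A A^T)^{t-1} g}: orthonormalize the basis as it is built,
    then take the best rank-k approximation of the projection Q Q^T A."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal(A.shape[0])
    basis = [g / np.linalg.norm(g)]
    for _ in range(t - 1):
        w = A @ (A.T @ basis[-1])          # one application of A A^T
        for qi in basis:                   # Gram-Schmidt against the existing basis
            w -= (qi @ w) * qi
        basis.append(w / np.linalg.norm(w))
    Q = np.column_stack(basis)             # n x t orthonormal basis of the Krylov space
    U, s, Wt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U[:, :k]) @ np.diag(s[:k]) @ Wt[:k, :]

# Example: rank-10 approximation of a 400 x 300 matrix from 40 single-vector Krylov iterations.
rng = np.random.default_rng(1)
A = rng.standard_normal((400, 300))
A_k = single_vector_krylov_lowrank(A, k=10, t=40)
```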
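For the fourth item, a rough sketch of an extended Krylov iteration with a fixed nonzero pole, in the spirit of F-EKSM: the shifted matrix is factored once and reused, and the basis is grown by alternating multiplications by A with solves against A − sI. The choice of pole, the alternation pattern, and the dense factorization are illustrative simplifications; the paper derives the optimal pole and targets large sparse matrices.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def extended_krylov_fixed_pole(A, b, s, m, f):
    """Illustrative extended Krylov sketch with a fixed nonzero pole s: alternately extend the
    basis with A v and (A - s I)^{-1} v (shifted matrix factored once), then approximate
    f(A) b by projection onto the resulting subspace."""
    n = A.shape[0]
    lu = lu_factor(A - s * np.eye(n))      # factor the shifted matrix once, reuse every step
    V = [b / np.linalg.norm(b)]
    for j in range(m):
        w = A @ V[-1] if j % 2 == 0 else lu_solve(lu, V[-1])
        for v in V:                        # Gram-Schmidt against the current basis
            w -= (v @ w) * v
        V.append(w / np.linalg.norm(w))
    V = np.column_stack(V)
    H = V.T @ A @ V
    evals, evecs = np.linalg.eigh(H)
    return V @ (evecs @ (f(evals) * (evecs.T @ (V.T @ b))))
```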