Krylov subspace methods are a ubiquitous tool for computing near-optimal rank-k approximations of large matrices. While "large block" Krylov methods with block size at least k give the best known theoretical guarantees, block size one (a single vector) or a small constant is often preferred in practice. Despite their popularity, we lack theoretical bounds on the performance of such "small block" Krylov methods for low-rank approximation. We address this gap between theory and practice by proving that small block Krylov methods essentially match all known low-rank approximation guarantees for large block methods. Via a black-box reduction we show, for example, that the standard single vector Krylov method run for t iterations obtains the same spectral norm and Frobenius norm error bounds as a Krylov method with block size ℓ ≥ k run for O(t/ℓ) iterations, up to a logarithmic dependence on the smallest gap between sequential singular values. That is, for a given number of matrix-vector products, single vector methods are essentially as effective as any choice of large block size. By combining our result with tail bounds on eigenvalue gaps in random matrices, we prove that the dependence on the smallest singular value gap can be eliminated if the input matrix is perturbed by a small random matrix. Further, we show that single vector methods match the more complex algorithm of [Bakshi et al. '22], which combines the results of multiple block sizes to achieve an improved algorithm for Schatten p-norm low-rank approximation.
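As a rough illustration of the method analyzed above, here is a minimal numpy sketch of the single vector (block size one) Krylov method for rank-k approximation: build a Krylov subspace from one random starting vector, orthonormalize it, and take the best rank-k approximation within that subspace. The function name, matrix sizes, iteration count t, and test spectrum are illustrative choices, not the paper's experiments or its tuned variant.

```python
import numpy as np

def single_vector_krylov_lowrank(A, k, t, rng=None):
    """Rank-k approximation of A from a single-vector (block size one)
    Krylov subspace; a minimal sketch, not the paper's tuned method."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    v = A @ rng.standard_normal(n)           # first Krylov vector, in R^m
    V = []
    for _ in range(t):
        v = v / np.linalg.norm(v)            # normalizing preserves the span
        V.append(v)
        v = A @ (A.T @ v)                    # one application of A A^T
    Q, _ = np.linalg.qr(np.column_stack(V))  # orthonormal Krylov basis
    # best rank-k approximation of A restricted to span(Q)
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U[:, :k]) * s[:k] @ Vt[:k]

# smoke test on a matrix with geometrically decaying spectrum
rng = np.random.default_rng(0)
U0, _ = np.linalg.qr(rng.standard_normal((500, 50)))
V0, _ = np.linalg.qr(rng.standard_normal((300, 50)))
A = U0 * (0.9 ** np.arange(50)) @ V0.T
A_hat = single_vector_krylov_lowrank(A, k=5, t=40, rng=1)
sigma = np.linalg.svd(A, compute_uv=False)
print(np.linalg.norm(A - A_hat, 2), sigma[5])  # spectral error vs optimal sigma_{k+1}
```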
New Tools for Smoothed Analysis: Least Singular Value Bounds for Random Matrices with Dependent Entries
We develop new techniques for proving lower bounds on the least singular value of random matrices with limited randomness. The matrices we consider have entries that are given by polynomials of a few underlying base random variables. This setting captures a core technical challenge for obtaining smoothed analysis guarantees in many algorithmic settings. Least singular value bounds often involve showing strong anti-concentration inequalities, which are intricate and much less understood than concentration (or large deviation) bounds. First, we introduce a general technique for proving anti-concentration that uses well-conditionedness properties of the Jacobian of a polynomial map, and show how to combine this with a hierarchical net argument to prove least singular value bounds. Our second tool is a new statement about least singular values that lets us reason about higher-order lifts of smoothed matrices and the action of linear operators on them. Apart from yielding simpler proofs of existing smoothed analysis results, these tools let us handle more general families of random matrices. This allows us to produce smoothed analysis guarantees in several previously open settings, including power sum decompositions and certifying robust entanglement of subspaces, where prior work could only establish least singular value bounds for fully random instances or only show non-robust genericity guarantees.
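To make the smoothed analysis setting concrete, the following sketch (my construction, not the paper's) contrasts the least singular value of an adversarially singular matrix with that of its randomly perturbed version. It shows the classical fully random phenomenon, sigma_min(A + eps*G) ≳ eps/sqrt(n) with high probability, that the tools above generalize to structured perturbations driven by few base random variables; the sizes and eps are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# adversarial instance: an exactly singular matrix, sigma_min(A) = 0
A = rng.standard_normal((n, n - 1)) @ rng.standard_normal((n - 1, n))
eps = 1e-3
A_smoothed = A + eps * rng.standard_normal((n, n))  # fully random perturbation

smin = lambda M: np.linalg.svd(M, compute_uv=False)[-1]
# classical smoothed analysis: sigma_min(A + eps*G) >~ eps/sqrt(n) w.h.p.
print(smin(A), smin(A_smoothed), eps / np.sqrt(n))
```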
- Award ID(s): 2008688
- PAR ID: 10555342
- Publisher / Repository: ACM
- Date Published:
- ISBN: 9798400703836
- Page Range / eLocation ID: 375 to 386
- Format(s): Medium: X
- Location: Vancouver BC Canada
- Sponsoring Org: National Science Foundation
More Like this
- Improving upon results of Rudelson and Vershynin, we establish delocalization bounds for eigenvectors of independent-entry random matrices. In particular, we show that with high probability every eigenvector is delocalized, meaning any subset of its coordinates carries an appropriate proportion of its mass. Our results hold for random matrices with genuinely complex as well as real entries. As an application of our methods, we also establish delocalization bounds for normal vectors to random hyperplanes. The proofs of our main results rely on a least singular value bound for genuinely complex rectangular random matrices, which generalizes a previous bound due to the first author and may be of independent interest. (A numerical illustration of delocalization appears as the first sketch after this list.)
- We develop deterministic perturbation bounds for singular values and vectors of orthogonally decomposable tensors, in a spirit similar to classical results for matrices such as those due to Weyl, Davis and Kahan, and Wedin. Our bounds demonstrate intriguing differences between matrices and higher order tensors. Most notably, they indicate that for higher order tensors perturbation affects each essential singular value/vector in isolation, and its effect on an essential singular vector does not depend on the multiplicity of its corresponding singular value or its distance from other singular values. Our results can be readily applied and provide a unified treatment to many different problems involving higher order orthogonally decomposable tensors. In particular, we illustrate the implications of our bounds through connected yet seemingly different high-dimensional data analysis tasks: the unsupervised learning scenario of tensor SVD and the supervised task of tensor regression, leading to new insights in both of these settings. (The second sketch after this list illustrates this robustness on a perturbed orthogonally decomposable tensor.)
- Simple stochastic momentum methods are widely used in machine learning optimization, but their good practical performance is at odds with an absence of theoretical guarantees of acceleration in the literature. In this work, we aim to close the gap between theory and practice by showing that stochastic heavy ball momentum retains the fast linear rate of (deterministic) heavy ball momentum on quadratic optimization problems, at least when minibatching with a sufficiently large batch size. The algorithm we study can be interpreted as an accelerated randomized Kaczmarz algorithm with minibatching and heavy ball momentum. The analysis relies on carefully decomposing the momentum transition matrix, and using new spectral norm concentration bounds for products of independent random matrices. We provide numerical illustrations demonstrating that our bounds are reasonably sharp. (The third sketch after this list runs minibatch stochastic heavy ball on a consistent least squares problem.)
- We develop a unified approach to bounding the largest and smallest singular values of an inhomogeneous random rectangular matrix, based on the non-backtracking operator and the Ihara-Bass formula for general random Hermitian matrices with a bipartite block structure. We obtain probabilistic upper (respectively, lower) bounds for the largest (respectively, smallest) singular values of a large rectangular random matrix X. These bounds are given in terms of the maximal and minimal 2-norms of the rows and columns of the variance profile of X. The proofs involve finding probabilistic upper bounds on the spectral radius of an associated non-backtracking matrix B. The two-sided bounds can be applied to the centered adjacency matrix of sparse inhomogeneous Erdős–Rényi bipartite graphs for a wide range of sparsity, down to criticality. In particular, for Erdős–Rényi bipartite graphs G(n, m, p) with p = ω(log n)/n and m/n → y ∈ (0,1), our sharp bounds imply that there are no outliers outside the support of the Marchenko–Pastur law almost surely. This result extends the Bai-Yin theorem to sparse rectangular random matrices. (The fourth sketch after this list compares the extreme singular values of such a centered adjacency matrix with the Marchenko–Pastur edges.)
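First, a minimal numpy sketch (my construction, not the first related paper's) of the delocalization phenomenon: for an iid random matrix, even the lightest coordinates of every eigenvector carry non-negligible mass, whereas a localized vector would place essentially zero mass there. The theorem's actual constants and exponents differ from this toy check.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
M = rng.standard_normal((n, n))            # iid real entries
_, vecs = np.linalg.eig(M)                 # columns are unit eigenvectors
m = n // 10
# mass each eigenvector places on its m *lightest* coordinates
light_mass = np.sort(np.abs(vecs) ** 2, axis=0)[:m].sum(axis=0)
# delocalization: bounded below polynomially in m/n for every eigenvector,
# far from the ~0 a localized vector would give (exact exponents differ)
print(light_mass.min(), m / n)
```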
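Second, a toy illustration of the tensor perturbation result: build an orthogonally decomposable tensor, add a small dense perturbation, and observe that tensor power iteration (a standard method, used here only for illustration) still locks onto a single component, consistent with each component being affected in isolation. Dimensions, rank, and noise level are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 30, 3
lam = np.array([3.0, 2.0, 1.0])
U, _ = np.linalg.qr(rng.standard_normal((n, r)))    # orthonormal components
T = np.einsum('i,ai,bi,ci->abc', lam, U, U, U)      # odeco 3-tensor
T_pert = T + 0.01 * rng.standard_normal((n, n, n))  # small dense perturbation

def tensor_power_iteration(T, iters=50, seed=1):
    v = np.random.default_rng(seed).standard_normal(T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = np.einsum('abc,b,c->a', T, v, v)        # T(I, v, v)
        v /= np.linalg.norm(v)
    return v

v = tensor_power_iteration(T_pert)
print(np.max(np.abs(U.T @ v)))  # ~1: one component recovered despite the noise
```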
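Third, a sketch of minibatch stochastic heavy ball momentum on a consistent least squares problem, the quadratic setting of the third related work (interpretable as minibatch randomized Kaczmarz with momentum). The step size and momentum are untuned illustrative values, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, batch = 2000, 50, 200
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star                       # consistent linear system

# minibatch stochastic heavy ball on f(x) = ||Ax - b||^2 / (2n)
alpha, beta = 0.5, 0.3               # step size and momentum (untuned)
x, x_prev = np.zeros(d), np.zeros(d)
for it in range(500):
    S = rng.choice(n, size=batch, replace=False)
    g = A[S].T @ (A[S] @ x - b[S]) / batch           # minibatch gradient
    x, x_prev = x - alpha * g + beta * (x - x_prev), x
print(np.linalg.norm(x - x_star))    # linear convergence in this regime
```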
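Finally, a sketch comparing the extreme singular values of a centered, rescaled adjacency matrix of a sparse Erdős–Rényi bipartite graph with the Marchenko–Pastur edges sqrt(n) ± sqrt(m), in the regime p = ω(log n)/n from the last related work. The sizes and the constant in p are arbitrary, and the comparison ignores lower-order sparsity corrections.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2000, 1000                      # m/n -> y = 0.5
p = 5 * np.log(n) / n                  # sparse regime, p = omega(log n)/n
B = (rng.random((n, m)) < p).astype(float)
X = (B - p) / np.sqrt(p * (1 - p))     # centered, variance-one entries
s = np.linalg.svd(X, compute_uv=False)
print(s[0], np.sqrt(n) + np.sqrt(m))   # largest  vs Marchenko–Pastur upper edge
print(s[-1], np.sqrt(n) - np.sqrt(m))  # smallest vs Marchenko–Pastur lower edge
```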