Title: Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates
Driven by a wide range of applications, several principal subspace estimation problems have been studied individually under different structural constraints. This paper presents a unified framework for the statistical analysis of a general structured principal subspace estimation problem, which includes as special cases sparse PCA/SVD, non-negative PCA/SVD, subspace constrained PCA/SVD, and spectral clustering. General minimax lower and upper bounds are established to characterize the interplay between the information-geometric complexity of the constraint set for the principal subspaces, the signal-to-noise ratio (SNR), and the dimensionality. The results yield interesting phase transition phenomena concerning the rates of convergence as a function of the SNRs and the fundamental limit for consistent estimation. Applying the general results to the specific settings yields the minimax rates of convergence for those problems, including the previously unknown optimal rates for sparse SVD, non-negative PCA/SVD, and subspace constrained PCA/SVD.
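As a rough illustration of the interplay between SNR and dimensionality described in the abstract, the following minimal sketch simulates an unstructured spiked model and measures how far the ordinary PCA estimate falls from the true principal subspace. It is not the paper's estimator and imposes no structural constraint. The function names, the Gaussian spiked model, and the choice of the projection Frobenius distance are all assumptions made for illustration.

```python
import numpy as np


def subspace_distance(U, V):
    """Frobenius distance between the projectors onto span(U) and span(V).

    U and V are assumed to have orthonormal columns; the value does not
    depend on which orthonormal basis is used for each subspace.
    """
    return np.linalg.norm(U @ U.T - V @ V.T, ord="fro")


def estimate_subspace(X, r):
    """Ordinary (unstructured) estimate: top-r right singular vectors of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:r].T


def simulate(n, p, r, snr, rng):
    """Draw n samples in dimension p from a hypothetical rank-r spiked model.

    Each row is U @ z * sqrt(snr) + noise, with z and noise standard normal,
    so `snr` is the spike strength relative to unit noise variance.
    """
    U, _ = np.linalg.qr(rng.standard_normal((p, r)))
    scores = rng.standard_normal((n, r)) * np.sqrt(snr)
    X = scores @ U.T + rng.standard_normal((n, p))
    return X, U


rng = np.random.default_rng(0)
for snr in (0.01, 0.1, 1.0, 10.0, 100.0):
    X, U = simulate(n=200, p=50, r=2, snr=snr, rng=rng)
    U_hat = estimate_subspace(X, r=2)
    print(f"snr={snr:>6}: distance={subspace_distance(U, U_hat):.3f}")
```

In this toy setting the distance is bounded above by sqrt(2r), so it stays close to that ceiling when the SNR is very small and shrinks as the SNR grows. A structured estimator of the kind the paper analyzes would additionally restrict the estimate to the constraint set.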
Award ID(s):
2015259
PAR ID:
10285915
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Journal of machine learning research
Volume:
22
ISSN:
1532-4435
Page Range / eLocation ID:
1-45
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Dette, Holger; Lee, Stephen; Pensky, Marianna (Ed.)
    Quantum state tomography, which aims to estimate quantum states that are described by density matrices, plays an important role in quantum science and quantum technology. This paper examines eigenspace estimation and the reconstruction of a large low-rank density matrix based on Pauli measurements. Both ordinary principal component analysis (PCA) and iterative thresholding sparse PCA (ITSPCA) estimators of the eigenspace are studied, and their respective convergence rates are established. In particular, we show that the ITSPCA estimator is rate-optimal. We present the reconstruction of the large low-rank density matrix and obtain its optimal convergence rate by using the ITSPCA estimator. A numerical study is carried out to investigate the finite sample performance of the proposed estimators.
  2. Recently, a wide range of memory-efficient LLM training algorithms have gained substantial popularity. These methods leverage the low-rank structure of gradients to project optimizer states into a subspace using a projection matrix found by singular value decomposition (SVD). However, convergence of these algorithms is highly dependent on the update rules of their projection matrix. This work provides the first convergence guarantee for arbitrary update rules of projection matrices, generally applicable to optimizers that can be analyzed with Hamiltonian Descent, including common ones such as LION and Adam. Inspired by this theoretical understanding, the authors propose Online Subspace Descent, a new family of subspace descent optimizers that do not rely on SVD. Instead of updating the projection matrix with eigenvectors, Online Subspace Descent updates it with online PCA. This approach is flexible and introduces minimal overhead to training. Experiments show that for pretraining LLaMA models ranging from 60M to 7B parameters on the C4 dataset, Online Subspace Descent achieves lower perplexity and better downstream task performance than state-of-the-art low-rank training methods across settings, narrowing the gap with full-rank baselines. 
  3. We perform a rigorous study of private matrix analysis when only the last W updates to matrices are considered useful for analysis. We show that the existing framework in the non-private setting is not robust to the noise required for privacy. We then propose a framework that is robust to noise and use it to give the first efficient o(W) space differentially private algorithms for spectral approximation, principal component analysis (PCA), multi-response linear regression, sparse PCA, and non-negative PCA. Prior to our work, no such result was known for sparse and non-negative differentially private PCA even in the static data setting. We also give a lower bound to demonstrate the cost of privacy in the sliding window model.
  4. Principal Components Analysis (PCA) is a dimension-reduction technique widely used in machine learning and statistics. However, due to the dependence of the principal components on all the dimensions, the components are notoriously hard to interpret. Therefore, a variant known as sparse PCA is often preferred. Sparse PCA learns principal components of the data but enforces that such components must be sparse. This has applications in diverse fields such as computational biology and image processing. To learn sparse principal components, it’s well known that standard PCA will not work, especially in high dimensions, and therefore algorithms for sparse PCA are often studied as a separate endeavor. Various algorithms have been proposed for sparse PCA over the years, but given how fundamental it is for applications in science, the limits of efficient algorithms are only partially understood. In this work, we study the limits of the powerful Sum of Squares (SoS) family of algorithms for sparse PCA. SoS algorithms have recently revolutionized robust statistics, leading to breakthrough algorithms for long-standing open problems in machine learning, such as optimally learning mixtures of Gaussians, robust clustering, robust regression, etc. Moreover, SoS is believed to be the optimal robust algorithm for many statistical problems. Therefore, for sparse PCA, it’s plausible that it can beat simpler algorithms such as diagonal thresholding that have been traditionally used. In this work, we show that this is not the case, by exhibiting strong tradeoffs between the number of samples required, the sparsity and the ambient dimension, for which SoS algorithms, even if allowed sub-exponential time, will fail to optimally recover the component. Our results are complemented by known algorithms in the literature, thereby painting an almost complete picture of the behavior of efficient algorithms for sparse PCA. Since SoS algorithms encapsulate many algorithmic techniques such as spectral or statistical query algorithms, this solidifies the message that known algorithms are optimal for sparse PCA. Moreover, our techniques are strong enough to obtain similar tradeoffs for Tensor PCA, another important higher order variant of PCA with applications in topic modeling, video processing, etc.
  5. (Early Access) Effective tissue clutter filtering is critical for non-contrast ultrasound imaging of slow blood flow in small vessels. Independent component analysis (ICA) has been considered by other groups for ultrasound clutter filtering in the past and was shown to be superior to principal component analysis (PCA)-based methods. However, it has not been considered specifically for slow flow applications or revisited since the onset of other slow flow-focused advancements in beamforming and tissue filtering, namely angled plane wave beamforming and full spatiotemporal singular value decomposition (SVD) (i.e., PCA-based) tissue filtering. In this work, we aim to develop a full spatiotemporal ICA-based tissue filtering technique facilitated by plane wave applications and compare it to SVD filtering. We compare ICA and SVD filtering in terms of optimal image quality in simulations and phantoms as well as in terms of optimal correlation to ground truth blood signal in simulations. Additionally, we propose an adaptive blood independent component sorting and selection method. We show that optimal and adaptive ICA can consistently separate blood from tissue better than principal component analysis (PCA)-based methods using simulations and phantoms. Additionally we demonstrate initial in vivo feasibility in ultrasound data of a liver tumor. 