
Title: Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates
Driven by a wide range of applications, several principal subspace estimation problems have been studied individually under different structural constraints. This paper presents a unified framework for the statistical analysis of a general structured principal subspace estimation problem which includes as special cases sparse PCA/SVD, non-negative PCA/SVD, subspace constrained PCA/SVD, and spectral clustering. General minimax lower and upper bounds are established to characterize the interplay between the information-geometric complexity of the constraint set for the principal subspaces, the signal-to-noise ratio (SNR), and the dimensionality. The results yield interesting phase transition phenomena concerning the rates of convergence as a function of the SNRs and the fundamental limit for consistent estimation. Applying the general results to the specific settings yields the minimax rates of convergence for those problems, including the previously unknown optimal rates for sparse SVD, non-negative PCA/SVD and subspace constrained PCA/SVD.
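As a rough, hypothetical illustration of the kind of estimator such constraint sets suggest (not an algorithm from the paper), a projected power method alternates a power step with a projection onto the constraint set. Here the constraint is non-negativity of the leading direction, as in non-negative PCA; the spiked covariance and all parameter values are assumptions for the sketch:

```python
import numpy as np

def projected_power_method(S, project, iters=200, seed=0):
    """Estimate the leading principal direction of S subject to a
    structural constraint, by alternating a power step with a
    projection onto the constraint set."""
    rng = np.random.default_rng(seed)
    v = project(rng.standard_normal(S.shape[0]))
    for _ in range(iters):
        v = project(S @ v)
    return v

def project_nonnegative(x):
    """Projection onto the non-negative part of the unit sphere."""
    x = np.maximum(x, 0.0)
    n = np.linalg.norm(x)
    return x / n if n > 0 else x

# Rank-one spiked covariance whose leading direction is non-negative.
p, snr = 50, 5.0
u = np.abs(np.random.default_rng(1).standard_normal(p))
u /= np.linalg.norm(u)
S = snr * np.outer(u, u) + 0.1 * np.eye(p)
v_hat = projected_power_method(S, project_nonnegative)
```

Swapping `project_nonnegative` for a sparsity or subspace projection gives the other constrained variants the abstract lists.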
Journal Name: Journal of Machine Learning Research
Sponsoring Org: National Science Foundation

More Like this
  1. Dette, Holger; Lee, Stephen; Pensky, Marianna (Eds.)
    Quantum state tomography, which aims to estimate quantum states that are described by density matrices, plays an important role in quantum science and quantum technology. This paper examines eigenspace estimation and the reconstruction of a large low-rank density matrix based on Pauli measurements. Both ordinary principal component analysis (PCA) and iterative thresholding sparse PCA (ITSPCA) estimators of the eigenspace are studied, and their respective convergence rates are established. In particular, we show that the ITSPCA estimator is rate-optimal. Using the ITSPCA estimator, we reconstruct the large low-rank density matrix and obtain its optimal convergence rate. A numerical study is carried out to investigate the finite sample performance of the proposed estimators.
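A minimal sketch of the reconstruction step described above: keep the top-r eigenpairs of a noisy Hermitian estimate, clip negative weights, and renormalize to a valid density matrix. This is a generic eigendecomposition-based projection, not the paper's ITSPCA estimator, and the Gaussian noise model and dimensions are assumptions for illustration:

```python
import numpy as np

def reconstruct_density_matrix(rho_noisy, r):
    """Project a noisy Hermitian estimate onto rank-r density matrices:
    keep the top-r eigenpairs, clip negative weights, renormalize trace."""
    w, V = np.linalg.eigh(rho_noisy)   # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:r]      # indices of the top-r eigenvalues
    w_top = np.clip(w[idx], 0.0, None)
    w_top /= w_top.sum()               # enforce unit trace
    return (V[:, idx] * w_top) @ V[:, idx].conj().T

# Rank-2 ground truth on a 16-dimensional space, plus symmetric noise.
rng = np.random.default_rng(0)
d, r = 16, 2
A = rng.standard_normal((d, r))
rho = A @ A.T
rho /= np.trace(rho)
noise = rng.standard_normal((d, d)) * 0.01
rho_hat = reconstruct_density_matrix(rho + (noise + noise.T) / 2, r)
```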
  2. The matrix sensing problem is an important low-rank optimization problem that has found a wide range of applications, such as matrix completion, phase synchronization/retrieval, robust principal component analysis (PCA), and power system state estimation. In this work, we focus on the general matrix sensing problem with linear measurements that are corrupted by random noise. We investigate the scenario where the search rank r is equal to the true rank [Formula: see text] of the unknown ground truth (the exact parametrized case), as well as the scenario where r is greater than [Formula: see text] (the overparametrized case). We quantify the role of the restricted isometry property (RIP) in shaping the landscape of the nonconvex factorized formulation and assisting with the success of local search algorithms. First, we develop a global guarantee on the maximum distance between an arbitrary local minimizer of the nonconvex problem and the ground truth under the assumption that the RIP constant is smaller than [Formula: see text]. We then present a local guarantee for problems with an arbitrary RIP constant, which states that any local minimizer is either considerably close to the ground truth or far away from it. More importantly, we prove that this noisy, overparametrized problem exhibits the strict saddle property, which leads to the global convergence of the perturbed gradient descent algorithm in polynomial time. The results of this work provide a comprehensive understanding of the geometric landscape of the matrix sensing problem in the noisy and overparametrized regime.

    Funding: This work was supported by grants from the National Science Foundation, Office of Naval Research, Air Force Office of Scientific Research, and Army Research Office.

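The factorized formulation the abstract analyzes can be sketched as plain gradient descent on f(U) = (1/m) Σᵢ (⟨Aᵢ, UUᵀ⟩ − yᵢ)². This toy version (noiseless generic Gaussian measurements, hand-picked step size and dimensions, all assumptions of the sketch) shows the nonconvex objective and its gradient, not the paper's analysis:

```python
import numpy as np

def matrix_sensing_gd(As, y, n, r, lr=0.05, iters=1000, seed=0):
    """Gradient descent on the nonconvex factorized objective
    f(U) = (1/m) * sum_i (<A_i, U U^T> - y_i)^2."""
    rng = np.random.default_rng(seed)
    m = len(As)
    U = 0.1 * rng.standard_normal((n, r))  # small random initialization
    for _ in range(iters):
        X = U @ U.T
        resid = np.array([np.sum(A * X) for A in As]) - y
        # d f / d U = (1/m) * sum_i 2 * resid_i * (A_i + A_i^T) @ U
        grad = sum(2.0 * ri * (Ai + Ai.T) for ri, Ai in zip(resid, As)) @ U / m
        U -= lr * grad
    return U

rng = np.random.default_rng(1)
n, r, m = 8, 1, 80
U_star = rng.standard_normal((n, r))
U_star /= np.linalg.norm(U_star)
X_star = U_star @ U_star.T
As = [rng.standard_normal((n, n)) for _ in range(m)]
y = np.array([np.sum(A * X_star) for A in As])  # noiseless measurements
U_hat = matrix_sensing_gd(As, y, n, r)
```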
  3. We propose a unified framework to solve general low-rank plus sparse matrix recovery problems based on matrix factorization, which covers a broad family of objective functions satisfying the restricted strong convexity and smoothness conditions. Based on projected gradient descent and the double thresholding operator, our proposed generic algorithm is guaranteed to converge to the unknown low-rank and sparse matrices at a locally linear rate, while matching the best-known robustness guarantee (i.e., tolerance for sparsity). At the core of our theory is a novel structural Lipschitz gradient condition for low-rank plus sparse matrices, which is essential for proving the linear convergence rate of our algorithm and which we believe is of independent interest for proving fast rates for general superposition-structured models. We illustrate the application of our framework through two concrete examples: robust matrix sensing and robust PCA. Empirical experiments corroborate our theory.
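One common instantiation of the double-thresholding idea, shown here for the robust PCA example where the observation is exactly low-rank plus sparse: alternate a hard threshold keeping the k largest-magnitude entries (sparse step) with a rank-r SVD truncation (low-rank step). This is a simplified alternating-thresholding sketch under assumed dimensions, not the paper's generic algorithm:

```python
import numpy as np

def lowrank_plus_sparse(M, r, k, iters=30):
    """Alternating hard thresholding for M ≈ L + S: L is the best rank-r
    approximation of M - S, and S keeps the k largest-magnitude entries
    of the residual M - L."""
    L = np.zeros_like(M)
    for _ in range(iters):
        # Sparse step: keep the k largest-magnitude entries of M - L.
        R = M - L
        S = np.zeros_like(M)
        idx = np.argpartition(np.abs(R).ravel(), -k)[-k:]
        S.ravel()[idx] = R.ravel()[idx]
        # Low-rank step: truncated SVD of M - S.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :r] * s[:r]) @ Vt[:r]
    return L, S

# Rank-2 matrix corrupted by 20 large sparse entries.
rng = np.random.default_rng(0)
n, r, k = 30, 2, 20
L_star = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
S_star = np.zeros((n, n))
pos = rng.choice(n * n, k, replace=False)
S_star.ravel()[pos] = 10.0
L_hat, S_hat = lowrank_plus_sparse(L_star + S_star, r, k)
```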
  4. Principal Components Analysis (PCA) is a dimension-reduction technique widely used in machine learning and statistics. However, because the principal components depend on all the dimensions, the components are notoriously hard to interpret. Therefore, a variant known as sparse PCA is often preferred. Sparse PCA learns principal components of the data but enforces that such components must be sparse. This has applications in diverse fields such as computational biology and image processing. To learn sparse principal components, it is well known that standard PCA will not work, especially in high dimensions, and therefore algorithms for sparse PCA are often studied as a separate endeavor. Various algorithms have been proposed for sparse PCA over the years, but given how fundamental it is for applications in science, the limits of efficient algorithms are only partially understood. In this work, we study the limits of the powerful Sum of Squares (SoS) family of algorithms for sparse PCA. SoS algorithms have recently revolutionized robust statistics, leading to breakthrough algorithms for long-standing open problems in machine learning, such as optimally learning mixtures of Gaussians, robust clustering, robust regression, etc. Moreover, SoS is believed to be the optimal robust algorithm for many statistical problems. Therefore, for sparse PCA, it is plausible that SoS can beat simpler algorithms such as diagonal thresholding that have been traditionally used. In this work, we show that this is not the case, by exhibiting strong tradeoffs between the number of samples required, the sparsity, and the ambient dimension, for which SoS algorithms, even if allowed sub-exponential time, will fail to optimally recover the component. Our results are complemented by known algorithms in the literature, thereby painting an almost complete picture of the behavior of efficient algorithms for sparse PCA. Since SoS algorithms encapsulate many algorithmic techniques such as spectral or statistical query algorithms, this solidifies the message that known algorithms are optimal for sparse PCA. Moreover, our techniques are strong enough to obtain similar tradeoffs for Tensor PCA, another important higher-order variant of PCA with applications in topic modeling, video processing, etc.
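The diagonal thresholding baseline mentioned above is simple to state: pick the k coordinates with the largest sample variance, then run ordinary PCA restricted to those coordinates. A minimal sketch under an assumed single-spike model (all parameter values are illustrative assumptions):

```python
import numpy as np

def diagonal_thresholding_spca(X, k):
    """Diagonal thresholding for sparse PCA: select the k coordinates
    with the largest sample variance, then take the leading eigenvector
    of the sample covariance restricted to that support."""
    n, p = X.shape
    support = np.argsort(X.var(axis=0))[::-1][:k]
    cov = np.cov(X[:, support], rowvar=False)
    w, V = np.linalg.eigh(cov)       # eigenvalues in ascending order
    v = np.zeros(p)
    v[support] = V[:, -1]            # leading eigenvector on the support
    return v

# Spiked model: x = sqrt(snr) * g * u + noise, with u k-sparse.
rng = np.random.default_rng(0)
n, p, k, snr = 2000, 200, 10, 4.0
u = np.zeros(p)
u[:k] = 1 / np.sqrt(k)
X = np.sqrt(snr) * rng.standard_normal((n, 1)) * u + rng.standard_normal((n, p))
v_hat = diagonal_thresholding_spca(X, k)
```

Coordinates in the support have variance 1 + snr·uᵢ², so with enough samples the variance screen recovers the support.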
  5. (Early Access) Effective tissue clutter filtering is critical for non-contrast ultrasound imaging of slow blood flow in small vessels. Independent component analysis (ICA) has been considered by other groups for ultrasound clutter filtering in the past and was shown to be superior to principal component analysis (PCA)-based methods. However, it has not been considered specifically for slow flow applications or revisited since the onset of other slow flow-focused advancements in beamforming and tissue filtering, namely angled plane wave beamforming and full spatiotemporal singular value decomposition (SVD) (i.e., PCA-based) tissue filtering. In this work, we aim to develop a full spatiotemporal ICA-based tissue filtering technique facilitated by plane wave applications and compare it to SVD filtering. We compare ICA and SVD filtering in terms of optimal image quality in simulations and phantoms, as well as in terms of optimal correlation to ground truth blood signal in simulations. Additionally, we propose an adaptive blood independent component sorting and selection method. We show that optimal and adaptive ICA can consistently separate blood from tissue better than PCA-based methods using simulations and phantoms. Additionally, we demonstrate initial in vivo feasibility in ultrasound data of a liver tumor.
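The SVD baseline this abstract compares against can be sketched in a few lines: stack the frames as columns of a Casorati matrix, zero the largest singular values (slow, high-energy tissue motion), and keep the remainder (blood plus noise). The synthetic tissue/blood signals and cutoff below are illustrative assumptions, not the paper's data or its adaptive cutoff selection:

```python
import numpy as np

def svd_clutter_filter(casorati, n_clutter):
    """Spatiotemporal SVD tissue-clutter filter: zero the n_clutter
    largest singular values of the pixels-by-frames Casorati matrix
    and reconstruct the remainder."""
    U, s, Vt = np.linalg.svd(casorati, full_matrices=False)
    s_filtered = s.copy()
    s_filtered[:n_clutter] = 0.0     # remove the tissue subspace
    return (U * s_filtered) @ Vt

# Synthetic ensemble: strong, slowly varying "tissue" plus a weak,
# fast "blood" component and additive noise.
rng = np.random.default_rng(0)
n_pix, n_frames = 400, 64
t = np.arange(n_frames)
tissue = 100.0 * np.outer(rng.standard_normal(n_pix), np.cos(0.02 * t))
blood = 1.0 * np.outer(rng.standard_normal(n_pix), np.cos(1.5 * t))
data = tissue + blood + 0.1 * rng.standard_normal((n_pix, n_frames))
filtered = svd_clutter_filter(data, n_clutter=1)
```

Because tissue dominates the first singular component here, removing it leaves the weak blood signal largely intact; choosing the cutoff adaptively is the harder problem the ICA approach above targets.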