skip to main content


Title: Gradient-based sparse principal component analysis with extensions to online learning
Summary

Sparse principal component analysis is an important technique for simultaneous dimensionality reduction and variable selection with high-dimensional data. In this work we combine the unique geometric structure of the sparse principal component analysis problem with recent advances in convex optimization to develop novel gradient-based sparse principal component analysis algorithms. These algorithms enjoy the same global convergence guarantee as the original alternating direction method of multipliers, and can be more efficiently implemented with the rich toolbox developed for gradient methods from the deep learning literature. Most notably, these gradient-based algorithms can be combined with stochastic gradient descent methods to produce efficient online sparse principal component analysis algorithms with provable numerical and statistical performance guarantees. The practical performance and usefulness of the new algorithms are demonstrated in various simulation studies. As an application, we show how the scalability and statistical accuracy of our method enable us to find interesting functional gene groups in high-dimensional RNA sequencing data.

 
more » « less
NSF-PAR ID:
10413351
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrika
Volume:
110
Issue:
2
ISSN:
0006-3444
Page Range / eLocation ID:
p. 339-360
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We provide a non-asymptotic analysis of the spiked Wishart and Wigner matrix models with a generative neural network prior. Spiked random matrices have the form of a rank-one signal plus noise and have been used as models for high dimensional Principal Component Analysis (PCA), community detection and synchronization over groups. Depending on the prior imposed on the spike, these models can display a statistical-computational gap between the information theoretically optimal reconstruction error that can be achieved with unbounded computational resources and the sub-optimal performances of currently known polynomial time algorithms. These gaps are believed to be fundamental, as in the emblematic case of Sparse PCA. In stark contrast to such cases, we show that there is no statistical-computational gap under a generative network prior, in which the spike lies on the range of a generative neural network. Specifically, we analyze a gradient descent method for minimizing a nonlinear least squares objective over the range of an expansive-Gaussian neural network and show that it can recover in polynomial time an estimate of the underlying spike with a rate-optimal sample complexity and dependence on the noise level. 
    more » « less
  2. Sparse principal component analysis and sparse canonical correlation analysis are two essential techniques from high-dimensional statistics and machine learning for analyzing large-scale data. Both problems can be formulated as an optimization problem with nonsmooth objective and nonconvex constraints. Because nonsmoothness and nonconvexity bring numerical difficulties, most algorithms suggested in the literature either solve some relaxations of them or are heuristic and lack convergence guarantees. In this paper, we propose a new alternating manifold proximal gradient method to solve these two high-dimensional problems and provide a unified convergence analysis. Numerical experimental results are reported to demonstrate the advantages of our algorithm. 
    more » « less
  3. Kutalik, Zoltán (Ed.)

    Epigenetic researchers often evaluate DNA methylation as a potential mediator of the effect of social/environmental exposures on a health outcome. Modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large multi-ethnic cohort in the United States, while providing an R package for their seamless implementation and adoption. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model (BSLMM) and high-dimensional mediation analysis (HDMA); while the preferred methods for estimating the global mediation effect are high-dimensional linear mediation analysis (HILMA) and principal component mediation analysis (PCMA). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.

     
    more » « less
  4. Summary

    In longitudinal data analysis one frequently encounters non-Gaussian data that are repeatedly collected for a sample of individuals over time. The repeated observations could be binomial, Poisson or of another discrete type or could be continuous. The timings of the repeated measurements are often sparse and irregular. We introduce a latent Gaussian process model for such data, establishing a connection to functional data analysis. The functional methods proposed are non-parametric and computationally straightforward as they do not involve a likelihood. We develop functional principal components analysis for this situation and demonstrate the prediction of individual trajectories from sparse observations. This method can handle missing data and leads to predictions of the functional principal component scores which serve as random effects in this model. These scores can then be used for further statistical analysis, such as inference, regression, discriminant analysis or clustering. We illustrate these non-parametric methods with longitudinal data on primary biliary cirrhosis and show in simulations that they are competitive in comparisons with generalized estimating equations and generalized linear mixed models.

     
    more » « less
  5. Abstract Purpose

    Parallel imaging and compressed sensing reconstructions of large MRI datasets often have a prohibitive computational cost that bottlenecks clinical deployment, especially for three‐dimensional (3D) non‐Cartesian acquisitions. One common approach is to reduce the number of coil channels actively used during reconstruction as in coil compression. While effective for Cartesian imaging, coil compression inherently loses signal energy, producing shading artifacts that compromise image quality for 3D non‐Cartesian imaging. We propose coil sketching, a general and versatile method for computationally‐efficient iterative MR image reconstruction.

    Theory and Methods

    We based our method on randomized sketching algorithms, a type of large‐scale optimization algorithms well established in the fields of machine learning and big data analysis. We adapt the sketching theory to the MRI reconstruction problem via a structured sketching matrix that, similar to coil compression, considers high‐energy virtual coils obtained from principal component analysis. But, unlike coil compression, it also considers random linear combinations of the remaining low‐energy coils, effectively leveraging information from all coils.

    Results

    First, we performed ablation experiments to validate the sketching matrix design on both Cartesian and non‐Cartesian datasets. The resulting design yielded both improved computatioanal efficiency and preserved signal‐to‐noise ratio (SNR) as measured by the inverse g‐factor. Then, we verified the efficacy of our approach on high‐dimensional non‐Cartesian 3D cones datasets, where coil sketching yielded up to three‐fold faster reconstructions with equivalent image quality.

    Conclusion

    Coil sketching is a general and versatile reconstruction framework for computationally fast and memory‐efficient reconstruction.

     
    more » « less