NSF PAR Search | NSF Public Access Repository

Core shrinkage covariance estimation for matrix-variate data

https://doi.org/10.1093/jrsssb/qkad070

Hoff, Peter; McCormack, Andrew; Zhang, Anru R (July 2023, Journal of the Royal Statistical Society Series B: Statistical Methodology)

A separable covariance model can describe the among-row and among-column correlations of a random matrix and permits likelihood-based inference with a very small sample size. However, if the assumption of separability is not met, data analysis with a separable model may misrepresent important dependence patterns in the data. As a compromise between separable and unstructured covariance estimation, we decompose a covariance matrix into a separable component and a complementary ‘core’ covariance matrix. The resulting decomposition retains the parsimony and interpretability of a separable covariance model, yet fully describes covariance matrices that are non-separable. It also motivates a new type of shrinkage estimator, obtained by appropriately shrinking the core of the sample covariance matrix, that adapts to the degree of separability of the population covariance matrix.
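As a rough illustration of what separability means in this abstract (a sketch only, not the authors' core shrinkage estimator): a separable covariance for a p1-by-p2 random matrix is the Kronecker product of a row covariance and a column covariance. The names `separable_covariance` and `entry_cov`, the example matrices, and the row-major vectorization convention are assumptions made for this example.

```python
import numpy as np

def separable_covariance(row_cov, col_cov):
    """Covariance of the row-major flattening of a random matrix whose
    rows have covariance row_cov and whose columns have covariance col_cov."""
    return np.kron(row_cov, col_cov)

def entry_cov(sigma, n_cols, i, j, k, l):
    """Look up Cov(Y[i, j], Y[k, l]) in the covariance of the flattened matrix."""
    return sigma[i * n_cols + j, k * n_cols + l]

# Hypothetical 2x2 row covariance and 3x3 column covariance.
row_cov = np.array([[2.0, 0.5],
                    [0.5, 1.0]])
col_cov = np.array([[1.0, 0.3, 0.0],
                    [0.3, 1.0, 0.3],
                    [0.0, 0.3, 1.0]])

sigma = separable_covariance(row_cov, col_cov)

# Under separability, Cov(Y[i, j], Y[k, l]) = row_cov[i, k] * col_cov[j, l].
example = entry_cov(sigma, n_cols=3, i=0, j=1, k=1, l=2)

# Free parameters: the Kronecker factors versus an unstructured 6x6 covariance.
# The "- 1" accounts for the scale that can be moved between the two factors.
n_separable = 2 * 3 // 2 + 3 * 4 // 2 - 1
n_unstructured = 6 * 7 // 2
```

The parameter counts show why a separable model can be fitted from very few samples, and why it cannot represent every covariance matrix of the flattened data.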
Full Text Available
TEMPTED: time-informed dimensionality reduction for longitudinal microbiome studies

https://doi.org/10.1186/s13059-024-03453-x

Shi, Pixu; Martino, Cameron; Han, Rungang; Janssen, Stefan; Buck, Gregory; Serrano, Myrna; Owzar, Kouros; Knight, Rob; Shenhav, Liat; Zhang, Anru R (December 2024, Genome Biology)
Tensor-on-tensor regression: Riemannian optimization, over-parameterization, statistical-computational gap and their interplay

https://doi.org/10.1214/24-AOS2396

Luo, Yuetian; Zhang, Anru R (December 2024, The Annals of Statistics)

Full Text Available
Mode-wise principal subspace pursuit and matrix spiked covariance model

https://doi.org/10.1093/jrsssb/qkae088

Tang, Runshi; Yuan, Ming; Zhang, Anru R (September 2024, Journal of the Royal Statistical Society Series B: Statistical Methodology)

This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a class of matrix-variate spiked covariance models that serve as inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two steps: Average Subspace Capture (ASC) and Alternating Projection. These steps are specifically designed to capture the row-wise and column-wise dimension-reduced subspaces which contain the most informative features of the data. ASC utilizes a novel average projection operator as initialization and achieves exact recovery in the noiseless setting. We analyse the convergence and non-asymptotic error bounds of MOP-UP, introducing a blockwise matrix eigenvalue perturbation bound that proves the desired bound, where classic perturbation bounds fail. The effectiveness and practical merits of the proposed framework are demonstrated through experiments on both simulated and real datasets. Lastly, we discuss generalizations of our approach to higher-order data.
Full Text Available
Soft phenotyping for sepsis via EHR time-aware soft clustering

https://doi.org/10.1016/j.jbi.2024.104615

Jiang, Shiyi; Gai, Xin; Treggiari, Miriam M; Stead, William W; Zhao, Yuankang; Page, C David; Zhang, Anru R (April 2024, Journal of Biomedical Informatics)

Full Text Available
Recursive Importance Sketching for Rank Constrained Least Squares: Algorithms and High-Order Convergence

https://doi.org/10.1287/opre.2023.2445

Luo, Yuetian; Huang, Wen; Li, Xudong; Zhang, Anru (January 2024, Operations Research)

Solving Rank Constrained Least Squares via Recursive Importance Sketching. In statistics and machine learning, we sometimes run into rank-constrained least squares problems, in which we need to find the best low-rank fit between sets of data, for example when identifying the factors that drive the data, filling in missing information, or finding connections between different datasets. This paper introduces a new method for solving this problem, the recursive importance sketching algorithm (RISRO), whose central idea is to break the problem down into smaller, easier parts using a technique called “recursive importance sketching.” The method is easy to use, efficient, and accurate. We prove that RISRO converges at a local quadratic-linear and quadratic rate under some mild conditions. Simulation studies also demonstrate the superior performance of RISRO.
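To make "best low-rank fit" concrete, here is a minimal sketch of the simplest special case, where the whole matrix is observed directly and the best rank-r fit in Frobenius norm is the truncated SVD (the Eckart-Young theorem). This is not RISRO; the function name `best_rank_r_fit` and the simulated data are assumptions made for this example.

```python
import numpy as np

def best_rank_r_fit(matrix, rank):
    """Best rank-`rank` approximation of `matrix` in Frobenius norm,
    obtained by keeping the leading singular triplets."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

# Simulated data: an exactly rank-2 signal plus small noise.
rng = np.random.default_rng(0)
left = rng.standard_normal((8, 2))
right = rng.standard_normal((2, 6))
noisy = left @ right + 0.01 * rng.standard_normal((8, 6))

fit = best_rank_r_fit(noisy, rank=2)

# The residual norm equals the norm of the discarded singular values.
singular_values = np.linalg.svd(noisy, compute_uv=False)
residual = np.linalg.norm(noisy - fit)
discarded = np.sqrt(np.sum(singular_values[2:] ** 2))
```

In the general problem described in the abstract, the matrix is seen only through linear measurements, so a single SVD no longer suffices and an iterative method is needed.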
Full Text Available
Low-rank Tensor Estimation via Riemannian Gauss-Newton: Statistical Optimality and Second-Order Convergence

Luo, Yuetian; Zhang, Anru R. (July 2023, Journal of machine learning research)

Full Text Available
Learning Good State and Action Representations for Markov Decision Process via Tensor Decomposition

Ni, Chengzhuo; Duan, Yaqi; Dahleh, Munther; Wang, Mengdi; Zhang, Anru R. (February 2023, Journal of machine learning research)

Full Text Available
Phase transition for detecting a small community in a large network

Jin, Jiashun; Ke, Tracy Zheng; Turner, Paxton; Zhang, Anru R. (February 2023, The Eleventh International Conference on Learning Representations)

Full Text Available
Exact Clustering in Tensor Block Model: Statistical Optimality and Computational Limit

https://doi.org/10.1111/rssb.12547

Han, Rungang; Luo, Yuetian; Wang, Miaoyan; Zhang, Anru R. (October 2022, Journal of the Royal Statistical Society Series B: Statistical Methodology)

High-order clustering aims to identify heterogeneous substructures in multiway datasets that arise commonly in neuroimaging, genomics, social network studies, etc. The non-convex and discontinuous nature of this problem poses significant challenges in both statistics and computation. In this paper, we propose a tensor block model and two computationally efficient methods for high-order clustering: the high-order Lloyd algorithm (HLloyd) and high-order spectral clustering (HSC). Convergence guarantees and statistical optimality are established for the proposed procedure under a mild sub-Gaussian noise assumption. Under the Gaussian tensor block model, we completely characterise the statistical-computational trade-off for achieving high-order exact clustering based on three different signal-to-noise ratio regimes. The analysis relies on new techniques of high-order spectral perturbation analysis and a ‘singular-value-gap-free’ error bound in tensor estimation, which are substantially different from the matrix spectral analyses in the literature. Finally, we show the merits of the proposed procedures via extensive experiments on both synthetic and real datasets.
Full Text Available
