Title: Core shrinkage covariance estimation for matrix-variate data
Abstract

A separable covariance model can describe the among-row and among-column correlations of a random matrix and permits likelihood-based inference with a very small sample size. However, if the assumption of separability is not met, data analysis with a separable model may misrepresent important dependence patterns in the data. As a compromise between separable and unstructured covariance estimation, we decompose a covariance matrix into a separable component and a complementary ‘core’ covariance matrix. The result is a new covariance matrix decomposition that retains the parsimony and interpretability of a separable covariance model, yet fully describes covariance matrices that are non-separable. This decomposition motivates a new type of shrinkage estimator, obtained by appropriately shrinking the core of the sample covariance matrix, that adapts to the degree of separability of the population covariance matrix.
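To make the idea concrete, here is a minimal numerical sketch, not the estimator developed in the paper: the separable component is taken to be the nearest Kronecker product of the sample covariance in Frobenius norm, the ‘core’ is the sample covariance whitened by that separable component, and the core is shrunk toward the identity with a user-chosen weight w. The paper's decomposition and its data-adaptive choice of shrinkage differ in detail; the function names below are illustrative.

```python
import numpy as np

def nearest_kronecker(S, m, n):
    """Best Frobenius-norm separable approximation A (m x m) kron B (n x n) of an
    (m*n) x (m*n) matrix S, via the Van Loan-Pitsianis rearrangement + rank-1 SVD."""
    R = S.reshape(m, n, m, n).transpose(0, 2, 1, 3).reshape(m * m, n * n)
    U, sv, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(sv[0]) * U[:, 0].reshape(m, m)
    B = np.sqrt(sv[0]) * Vt[0].reshape(n, n)
    if np.trace(A) < 0:                      # resolve the SVD sign ambiguity
        A, B = -A, -B
    return A, B

def _sqrtm_psd(M):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def core_shrinkage_illustration(S, m, n, w):
    """Whiten S by its separable approximation, shrink the resulting 'core'
    toward the identity with weight w in [0, 1], and re-colour."""
    A, B = nearest_kronecker(S, m, n)
    K_half = _sqrtm_psd(np.kron(A, B))       # assumes the separable fit is positive definite
    K_half_inv = np.linalg.inv(K_half)
    core = K_half_inv @ S @ K_half_inv       # core of the sample covariance
    core_shrunk = (1.0 - w) * core + w * np.eye(m * n)
    return K_half @ core_shrunk @ K_half     # shrunk covariance estimate
```

In this simplified version, w = 1 collapses the estimate to the separable fit np.kron(A, B), while w = 0 returns the sample covariance S unchanged; intermediate weights interpolate between the two extremes.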

 
Award ID(s): 2203741
NSF-PAR ID: 10501824
Author(s) / Creator(s): ; ;
Publisher / Repository: Royal Statistical Society
Date Published:
Journal Name: Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume: 85
Issue: 5
ISSN: 1369-7412
Page Range / eLocation ID: 1659 to 1679
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. We derive and study a significance test for determining whether a panel of functional time series is separable. In the context of this paper, separability means that the covariance structure factors into the product of two functions, one depending only on time and the other depending only on the coordinates of the panel. Separability is a property that can dramatically improve computational efficiency by substantially reducing model complexity. It is especially useful for functional data, as it implies that the functional principal components are the same for each member of the panel. However, such an assumption must be verified before proceeding with further inference. Our approach is based on functional norm differences and provides a test with well-controlled size and high power. We establish our procedure quite generally, allowing one to test separability of autocovariances as well. In addition to an asymptotic justification, our methodology is validated by a simulation study. It is applied to functional panels of particulate pollution and stock market data.
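    As a rough illustration of the kind of norm-difference statistic involved (not necessarily the authors' exact construction, and with the calibration of its null distribution omitted), the sketch below compares the empirical covariance of a discretised panel with a separable reconstruction built from its partial traces; the array shape (N, p, T) for replications, panel coordinates and time points is an assumption of the example.

```python
import numpy as np

def separability_statistic(X):
    """Squared Frobenius distance between the empirical covariance of a
    discretised functional panel and its separable reconstruction from
    partial traces.  X has shape (N, p, T): N replications, p panel
    members, T time points."""
    N, p, T = X.shape
    Xc = X - X.mean(axis=0)                    # centre over replications
    # Full empirical covariance, shape (p, T, p, T)
    C = np.einsum('ijt,iks->jtks', Xc, Xc) / N
    # Partial traces: over panel coordinates -> time covariance, and vice versa
    C_time = np.einsum('jtjs->ts', C)          # trace over the panel index
    C_panel = np.einsum('jtkt->jk', C)         # trace over the time index
    total = np.einsum('jtjt->', C)             # total variance
    C_sep = np.einsum('jk,ts->jtks', C_panel, C_time) / total
    return np.sum((C - C_sep) ** 2)
```

    Large values of this statistic indicate departure from separability; turning it into a test with controlled size (via asymptotics or resampling) is the part developed in the paper and omitted here.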
  2. Abstract

    Estimation of an unstructured covariance matrix is difficult because of the challenges posed by parameter space dimensionality and the positive‐definiteness constraint that estimates should satisfy. We consider a general nonparametric covariance estimation framework for longitudinal data using the Cholesky decomposition of a positive‐definite matrix. The covariance matrix of time‐ordered measurements is diagonalized by a lower triangular matrix with unconstrained entries that are statistically interpretable as parameters for a varying coefficient autoregressive model. Using this dual interpretation of the Cholesky decomposition and allowing for irregular sampling time points, we treat covariance estimation as bivariate smoothing and cast it in a regularization framework for desired forms of simplicity in covariance models. Viewing stationarity as a form of simplicity or parsimony in covariance, we model the varying coefficient function with components depending on time lag and its orthogonal direction separately and penalize the components that capture the nonstationarity in the fitted function. We demonstrate construction of a covariance estimator using the smoothing spline framework. Simulation studies establish the advantage of our approach over alternative estimators proposed in the longitudinal data setting. We analyze a longitudinal dataset to illustrate application of the methodology and compare our estimates to those resulting from alternative models.

    This article is categorized under:

    Data: Types and Structure > Time Series, Stochastic Processes, and Functional Data

    Statistical and Graphical Methods of Data Analysis > Nonparametric Methods

    Algorithms and Computational Methods > Maximum Likelihood Methods
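    The ‘dual interpretation’ of the Cholesky decomposition appealed to in the abstract above is commonly formalized as the modified Cholesky decomposition T Σ T' = D: the below-diagonal entries of -T are the coefficients of a varying-coefficient autoregression and D holds the innovation variances. A minimal sketch of that decomposition follows; the smoothing-spline estimation and nonstationarity penalties of the abstract are not reproduced.

```python
import numpy as np

def cholesky_ar_decomposition(Sigma):
    """Modified Cholesky decomposition T Sigma T' = D of the covariance of
    time-ordered measurements.  The strictly lower-triangular part of -T
    holds the autoregressive coefficients phi[t, j] in
    y_t = sum_{j<t} phi[t, j] * y_j + e_t, and d holds Var(e_t)."""
    L = np.linalg.cholesky(Sigma)              # Sigma = L L'
    d = np.diag(L) ** 2                        # innovation variances
    T = np.linalg.inv(L / np.diag(L))          # unit lower-triangular factor inverse
    phi = -np.tril(T, k=-1)                    # varying-coefficient AR coefficients
    return phi, d
```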

     
  3. Summary

    The covariance structure of multivariate functional data can be highly complex, especially if the multivariate dimension is large, making extensions of statistical methods for standard multivariate data to the functional data setting challenging. For example, Gaussian graphical models have recently been extended to the setting of multivariate functional data by applying multivariate methods to the coefficients of truncated basis expansions. However, compared with multivariate data, a key difficulty is that the covariance operator is compact and thus not invertible. This paper addresses the general problem of covariance modelling for multivariate functional data, and functional Gaussian graphical models in particular. As a first step, a new notion of separability for the covariance operator of multivariate functional data is proposed, termed partial separability, leading to a novel Karhunen–Loève-type expansion for such data. Next, the partial separability structure is shown to be particularly useful in providing a well-defined functional Gaussian graphical model that can be identified with a sequence of finite-dimensional graphical models, each of identical fixed dimension. This motivates a simple and efficient estimation procedure through application of the joint graphical lasso. Empirical performance of the proposed method for graphical model estimation is assessed through simulation and analysis of functional brain connectivity during a motor task.
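    For intuition only, here is a simplified sketch of the pipeline described above, with several substitutions: grid sums stand in for functional inner products, the shared basis is taken from the summed empirical covariance, and a separate graphical lasso (scikit-learn's GraphicalLasso) is fitted to each layer of scores in place of the joint graphical lasso; the names X, K and alpha are illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def partial_separability_graphs(X, K=3, alpha=0.1):
    """X has shape (N, p, T): N subjects, p functional variables on a common
    grid of T points.  Estimate a shared basis from the summed covariance,
    project each variable onto the first K basis vectors, and fit a sparse
    graphical model to each K-dimensional layer of scores."""
    N, p, T = X.shape
    Xc = X - X.mean(axis=0)
    # Summed (pooled) covariance of the p functional variables on the grid
    H = sum(Xc[:, j].T @ Xc[:, j] for j in range(p)) / N
    eigval, eigvec = np.linalg.eigh(H)
    basis = eigvec[:, ::-1][:, :K]             # leading K eigenvectors
    precisions = []
    for k in range(K):
        scores = Xc @ basis[:, k]              # shape (N, p): k-th scores
        gl = GraphicalLasso(alpha=alpha).fit(scores)
        precisions.append(gl.precision_)       # sparsity pattern = graph at level k
    return precisions
```

    The abstract's point is that, under partial separability, the functional Gaussian graphical model is identified with exactly such a sequence of finite-dimensional graphs, which is what makes joint estimation across levels tractable.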
  4. Abstract

    We propose a novel regularized mixture model for clustering matrix‐valued data. The proposed method assumes a separable covariance structure for each cluster and imposes a sparsity structure (e.g., low rankness, spatial sparsity) on the mean signal of each cluster. We formulate the problem as a finite mixture model of matrix‐normal distributions with regularization terms and then develop an expectation–maximization (EM) type algorithm for efficient computation. In theory, we show that the proposed estimators are strongly consistent for various choices of penalty functions. Simulation and two applications to brain signal studies confirm the excellent performance of the proposed method, including better prediction accuracy than the competitors and scientific interpretability of the solution.
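    The building block of such a mixture is the matrix-normal density, whose covariance is exactly the separable (Kronecker) structure assumed for each cluster. A minimal sketch of its log-density is given below; the regularized EM algorithm itself is not reproduced, and the argument names are illustrative.

```python
import numpy as np

def matrix_normal_logpdf(X, M, U, V):
    """Log-density of the matrix-normal distribution MN(M, U, V) for an n x p
    matrix X, with n x n among-row covariance U and p x p among-column
    covariance V, i.e. Cov(vec(X)) = V kron U for column-stacked vec(X)."""
    n, p = X.shape
    R = X - M
    _, logdet_u = np.linalg.slogdet(U)
    _, logdet_v = np.linalg.slogdet(V)
    # tr(V^{-1} R' U^{-1} R)
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
    return -0.5 * (n * p * np.log(2 * np.pi) + p * logdet_u + n * logdet_v + quad)
```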

     
  5. Summary

    We propose to model a spatio-temporal random field that has nonstationary covariance structure in both space and time domains by applying the concept of the dimension expansion method in Bornn et al. (2012). Simulations are conducted for both separable and nonseparable space-time covariance models, and the model is also illustrated with a streamflow dataset. Both simulation and data analyses show that modeling nonstationarity in both space and time can improve the predictive performance over stationary covariance models or models that are nonstationary in space but stationary in time.
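    As a toy sketch of the dimension-expansion idea only (not the space-time model of the abstract): latent extra coordinates are learned so that a stationary, isotropic covariance in the expanded space matches an empirical covariance estimated from replicated fields. The exponential covariance, the ridge penalty lam and the names coords, emp_cov and phi are all assumptions of this illustration.

```python
import numpy as np
from scipy.optimize import minimize

def expand_dimensions(coords, emp_cov, extra_dims=1, phi=1.0, lam=0.1):
    """Learn latent coordinates Z so that an isotropic exponential covariance
    in the expanded space [coords, Z] matches the empirical covariance of n
    sites; coords has shape (n, d) and emp_cov has shape (n, n)."""
    n = coords.shape[0]
    sigma2 = np.mean(np.diag(emp_cov))

    def loss(z_flat):
        Z = z_flat.reshape(n, extra_dims)
        aug = np.hstack([coords, Z])                       # expanded coordinates
        D = np.linalg.norm(aug[:, None, :] - aug[None, :, :], axis=-1)
        fitted = sigma2 * np.exp(-D / phi)                 # stationary model in expanded space
        return np.sum((emp_cov - fitted) ** 2) + lam * np.sum(Z ** 2)

    z0 = 1e-3 * np.random.default_rng(0).standard_normal(n * extra_dims)
    res = minimize(loss, z0, method='L-BFGS-B')
    return res.x.reshape(n, extra_dims)
```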

     