Title: Robust Functional Principal Component Analysis via a Functional Pairwise Spatial Sign Operator
Abstract: Functional principal component analysis (FPCA) has been widely used to capture major modes of variation and reduce dimensions in functional data analysis. However, standard FPCA based on the sample covariance estimator does not work well when the data exhibit heavy tails or contain outliers. To address this challenge, a new robust FPCA approach based on a functional pairwise spatial sign (PASS) operator, termed PASS FPCA, is introduced. We propose robust estimation procedures for eigenfunctions and eigenvalues. Theoretical properties of the PASS operator are established, showing that it shares the eigenfunctions of the standard covariance operator and allows the ratios between eigenvalues to be recovered. We also extend the proposed procedure to handle functional data measured with noise. Compared to existing robust FPCA approaches, the proposed PASS FPCA requires weaker distributional assumptions to preserve the eigenspace of the covariance function. Specifically, existing work is often built upon a class of functional elliptical distributions, which inherently requires symmetry. In contrast, we introduce a class of distributions called the weakly functional coordinate symmetry (weakly FCS) class, which allows for severe asymmetry and is much more flexible than the functional elliptical distribution family. The robustness of the PASS FPCA is demonstrated via extensive simulation studies, especially its advantages in scenarios with nonelliptical distributions. The proposed method was motivated by and applied to the analysis of accelerometry data from the Objective Physical Activity and Cardiovascular Health Study, a large-scale epidemiological study to investigate the relationship between objectively measured physical activity and cardiovascular health among older women.
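To make the idea in the abstract concrete, the following is a minimal sketch of a pairwise spatial sign estimator for curves observed on a common, equally spaced grid. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the operator's formula, so the sketch assumes the kernel is the average over pairs of (X_i - X_j)(s)(X_i - X_j)(t) / ||X_i - X_j||^2, and it ignores measurement noise. The function name `pass_fpca_sketch` and its parameters are hypothetical.

```python
import numpy as np


def pass_fpca_sketch(curves, grid_step, n_components=2):
    """Hypothetical sketch of a pairwise spatial sign FPCA estimate.

    curves: (n, m) array, n curves observed on a common equally spaced grid.
    grid_step: spacing of the grid, used to approximate L2 integrals.
    Returns (eigenvalues, eigenfunctions, kernel); the eigenvalues are those
    of the sign operator, which are not the covariance eigenvalues.
    """
    X = np.asarray(curves, dtype=float)
    n, m = X.shape
    kernel = np.zeros((m, m))
    n_pairs = 0
    for i in range(n - 1):
        # Pairwise differences cancel the mean, so no centering is needed.
        diffs = X[i] - X[i + 1:]
        sq_norms = (diffs ** 2).sum(axis=1) * grid_step  # approx. squared L2 norm
        keep = sq_norms > 0  # skip identical curves (sign is undefined)
        signs = diffs[keep] / np.sqrt(sq_norms[keep])[:, None]
        kernel += signs.T @ signs
        n_pairs += int(keep.sum())
    kernel /= n_pairs
    # Discretized integral operator: (K f)(s) is approximated by sum_t K[s, t] f[t] * grid_step.
    evals, evecs = np.linalg.eigh(kernel * grid_step)
    order = np.argsort(evals)[::-1][:n_components]
    # Rescale so each eigenfunction has unit L2 norm on the grid.
    eigenfunctions = evecs[:, order].T / np.sqrt(grid_step)
    return evals[order], eigenfunctions, kernel


if __name__ == "__main__":
    # Toy data: two orthonormal modes with heavy-tailed (Cauchy) scores.
    rng = np.random.default_rng(0)
    m = 50
    t = (np.arange(m) + 0.5) / m
    phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
    phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)
    scores = rng.standard_t(df=1, size=(200, 2)) * np.array([3.0, 1.0])
    data = scores[:, [0]] * phi1 + scores[:, [1]] * phi2
    evals, efuns, _ = pass_fpca_sketch(data, grid_step=1.0 / m)
    print("sign-operator eigenvalues:", evals)
    print("alignment with first mode:", abs(np.sum(efuns[0] * phi1) / m))
```

Because each pairwise difference is scaled to unit norm, a single extreme curve cannot dominate the estimate, which is the intuition behind the robustness claim. Under this assumed definition the eigenvalues sum to one, so only their relative ordering and ratios are informative.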
Award ID(s):
2019363
PAR ID:
10485800
Publisher / Repository:
Oxford University Press
Journal Name:
Biometrics
Volume:
79
Issue:
2
ISSN:
0006-341X
Format(s):
Medium: X
Size(s):
p. 1239-1253
Sponsoring Org:
National Science Foundation
More Like this
  1. Functional Principal Component Analysis (FPCA) has become a widely used dimension reduction tool for functional data analysis. When additional covariates are available, existing FPCA models integrate them either in the mean function or in both the mean function and the covariance function. However, methods of the first kind are not suitable for data that display second-order variation, while those of the second kind are time-consuming and make it difficult to perform subsequent statistical analyses on the dimension-reduced representations. To tackle these issues, we introduce an eigen-adjusted FPCA model that integrates covariates in the covariance function only through its eigenvalues. In particular, different structures on the covariate-specific eigenvalues—corresponding to different practical problems—are discussed to illustrate the model’s flexibility as well as utility. To handle functional observations under … 
  2. Functional data have received significant attention as they frequently appear in modern applications, such as functional magnetic resonance imaging (fMRI) and natural language processing. The infinite-dimensional nature of functional data makes it necessary to use dimension reduction techniques. Most existing techniques, however, rely on the covariance operator, which can be affected by heavy-tailed data and unusual observations. Therefore, in this paper, we consider a robust sliced inverse regression for multivariate elliptical functional data. For that reason, we introduce a new statistical linear operator, called the conditional spatial sign Kendall’s tau covariance operator, which can be seen as an extension of the multivariate Kendall’s tau to both the conditional and functional settings. The new operator is robust to heavy-tailed data and outliers, and hence can provide a robust estimate of the sufficient predictors. We also derive the convergence rates of the proposed estimators for both completely and partially observed data. Finally, we demonstrate the finite sample performance of our estimator using simulation examples and a real dataset based on fMRI. 
  3. We propose a broad class of models for time series of curves (functions) that can be used to quantify near long‐range dependence or near unit root behavior. We establish fundamental properties of these models and rates of consistency for the sample mean function and the sample covariance operator. The latter plays a role analogous to sample cross‐covariances for multivariate time series, but is far more important in the functional setting because its eigenfunctions are used in principal component analysis, which is a major tool in functional data analysis, used for dimension reduction or feature extraction. We also establish a central limit theorem (CLT) for functions following our model. Both the consistency rates and the normalizations in the CLT are nonstandard. They reflect the local unit root behavior and the long memory structure at moderate lags.
  4. Summary The covariance structure of multivariate functional data can be highly complex, especially if the multivariate dimension is large, making extensions of statistical methods for standard multivariate data to the functional data setting challenging. For example, Gaussian graphical models have recently been extended to the setting of multivariate functional data by applying multivariate methods to the coefficients of truncated basis expansions. However, compared with multivariate data, a key difficulty is that the covariance operator is compact and thus not invertible. This paper addresses the general problem of covariance modelling for multivariate functional data, and functional Gaussian graphical models in particular. As a first step, a new notion of separability for the covariance operator of multivariate functional data is proposed, termed partial separability, leading to a novel Karhunen–Loève-type expansion for such data. Next, the partial separability structure is shown to be particularly useful in providing a well-defined functional Gaussian graphical model that can be identified with a sequence of finite-dimensional graphical models, each of identical fixed dimension. This motivates a simple and efficient estimation procedure through application of the joint graphical lasso. Empirical performance of the proposed method for graphical model estimation is assessed through simulation and analysis of functional brain connectivity during a motor task. 
  5. In this paper, we discuss the convergence analysis of the conjugate gradient-based algorithm for the functional linear model in the reproducing kernel Hilbert space framework, utilizing early stopping results in regularization against over-fitting. We establish the convergence rates depending on the regularity condition of the slope function and the decay rate of the eigenvalues of the operator composition of covariance and kernel operator. Our convergence rates match the minimax rate available from the literature. 