

Title: Latent deformation models for multivariate functional data and time‐warping separability
Abstract

Multivariate functional data present theoretical and practical complications that are not found in univariate functional data. One of these is a situation where the component functions of multivariate functional data are positive and are subject to mutual time warping. That is, the component processes exhibit a common shape but are subject to systematic phase variation across their domains in addition to subject‐specific time warping, where each subject has its own internal clock. This motivates a novel model for multivariate functional data that connects such mutual time warping to a latent‐deformation‐based framework by exploiting a novel time‐warping separability assumption. This separability assumption allows for meaningful interpretation and dimension reduction. The resulting latent deformation model is shown to be well suited to represent commonly encountered functional vector data. The proposed approach combines a random amplitude factor for each component with population‐based registration across the components of a multivariate functional data vector and includes a latent population function, which corresponds to a common underlying trajectory. We propose estimators for all components of the model, enabling implementation of the proposed data‐based representation for multivariate functional data and downstream analyses such as Fréchet regression. Rates of convergence are established when curves are fully observed or observed with measurement error. The usefulness of the model, interpretations, and practical aspects are illustrated in simulations and with application to multivariate human growth curves and multivariate environmental pollution data.
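
The structure described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the latent curve `latent`, the power-function warps, the amplitude distribution, and the function name `simulate_latent_deformation` are all hypothetical choices made for illustration. It reads "time-warping separability" as meaning that the warp applied to subject i's component j is the composition of a component-level warp and a subject-level warp.

```python
import numpy as np


def latent(s):
    """Hypothetical positive latent population curve on [0, 1]."""
    return 1.0 + np.exp(-((s - 0.5) ** 2) / 0.02)


def simulate_latent_deformation(n_subjects=5, n_components=3, n_grid=101, seed=0):
    """Simulate positive multivariate curves sharing one latent shape.

    Assumed (illustrative) structure:
      X[i, j](t) = amp[i, j] * latent(component_warp_j(subject_warp_i(t)))
    where both warps are power functions t -> t**a, which are increasing
    maps of [0, 1] onto itself.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_grid)

    # Component-level warps: fixed, systematic phase differences.
    comp_exp = np.exp(np.linspace(-0.3, 0.3, n_components))
    # Subject-level warps: each subject's random "internal clock".
    subj_exp = np.exp(rng.normal(0.0, 0.2, n_subjects))
    # Random positive amplitude factor per subject and component.
    amp = rng.uniform(0.5, 1.5, size=(n_subjects, n_components))

    X = np.empty((n_subjects, n_components, n_grid))
    for i in range(n_subjects):
        for j in range(n_components):
            # Separable warp: component warp composed with subject warp.
            warped_time = (t ** subj_exp[i]) ** comp_exp[j]
            X[i, j] = amp[i, j] * latent(warped_time)
    return t, X, amp


t, X, amp = simulate_latent_deformation()
print(X.shape)
```

In this sketch only one warp per subject and one per component are drawn, rather than one per subject-component pair, which is the kind of dimension reduction the separability assumption is meant to provide.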

 
Award ID(s):
2014626 2310450
PAR ID:
10419779
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrics
ISSN:
0006-341X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Multivariate functional data are becoming ubiquitous with advances in modern technology and are substantially more complex than univariate functional data. We propose and study a novel model for multivariate functional data where the component processes are subject to mutual time warping. That is, the component processes exhibit a similar shape but are subject to systematic phase variation across their time domains. To address this previously unconsidered mode of warping, we propose new registration methodology that is based on a shift‐warping model. Our method differs from all existing registration methods for functional data in a fundamental way. Namely, instead of focusing on the traditional approach to warping, where one aims to recover individual‐specific registration, we focus on shift registration across the components of a multivariate functional data vector on a population‐wide level. Our proposed estimates for these shifts are identifiable, enjoy parametric rates of convergence, and often have intuitive physical interpretations, all in contrast to traditional curve‐specific registration approaches. We demonstrate the implementation and interpretation of the proposed method by applying our methodology to the Zürich Longitudinal Growth data and study its finite sample properties in simulations.

     
  2. Independent component analysis (ICA) decomposes multivariate data into mutually independent components (ICs). The ICA model is subject to a constraint that at most one of these components is Gaussian, which is required for model identifiability. Linear non‐Gaussian component analysis (LNGCA) generalizes the ICA model to a linear latent factor model with any number of both non‐Gaussian components (signals) and Gaussian components (noise), where observations are linear combinations of independent components. Although the individual Gaussian components are not identifiable, the Gaussian subspace is identifiable. We introduce an estimator along with its optimization approach in which non‐Gaussian and Gaussian components are estimated simultaneously, maximizing the discrepancy of each non‐Gaussian component from Gaussianity while minimizing the discrepancy of each Gaussian component from Gaussianity. When the number of non‐Gaussian components is unknown, we develop a statistical test to determine it based on resampling and the discrepancy of estimated components. Through a variety of simulation studies, we demonstrate the improvements of our estimator over competing estimators, and we illustrate the effectiveness of our test to determine the number of non‐Gaussian components. Further, we apply our method to real data examples and show its practical value.

     
  3. We derive and study a significance test for determining if a panel of functional time series is separable. In the context of this paper, separability means that the covariance structure factors into the product of two functions, one depending only on time and the other depending only on the coordinates of the panel. Separability is a property which can dramatically improve computational efficiency by substantially reducing model complexity. It is especially useful for functional data as it implies that the functional principal components are the same for each member of the panel. However, such an assumption must be verified before proceeding with further inference. Our approach is based on functional norm differences and provides a test with well-controlled size and high power. We establish our procedure quite generally, allowing one to test separability of autocovariances as well. In addition to an asymptotic justification, our methodology is validated by a simulation study. It is applied to functional panels of particulate pollution and stock market data.
  4. Summary

    Multivariate functional data are increasingly encountered in data analysis, whereas statistical models for such data are not well developed yet. Motivated by a case-study where one aims to quantify the relationship between various longitudinally recorded behaviour intensities for Drosophila flies, we propose a functional linear manifold model. This model reflects the functional dependence between the components of multivariate random processes and is defined through data-determined linear combinations of the multivariate component trajectories, which are characterized by a set of varying-coefficient functions. The time varying linear relationships that govern the components of multivariate random functions yield insights about the underlying processes and also lead to noise-reduced representations of the multivariate component trajectories. The functional linear manifold model proposed is put to the task for an analysis of longitudinally observed behavioural patterns of flying, feeding, walking and resting over the lifespan of Drosophila flies and is also investigated in simulations.

     
  5. Summary

    The paper is concerned with testing normality in samples of curves and error curves estimated from functional regression models. We propose a general paradigm based on the application of multivariate normality tests to vectors of functional principal components scores. We examine finite sample performance of a number of such tests and select the best performing tests. We apply them to several extensively used functional data sets and determine which can be treated as normal, possibly after a suitable transformation. We also offer practical guidance on software implementations of all tests we study and develop large sample justification for tests based on sample skewness and kurtosis of functional principal component scores.

     