skip to main content

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Thursday, January 16 until 2:00 AM ET on Friday, January 17 due to maintenance. We apologize for the inconvenience.


Title: Linear Manifold Modelling of Multivariate Functional Data
Summary

Multivariate functional data are increasingly encountered in data analysis, whereas statistical models for such data are not well developed yet. Motivated by a case-study where one aims to quantify the relationship between various longitudinally recorded behaviour intensities for Drosophila flies, we propose a functional linear manifold model. This model reflects the functional dependence between the components of multivariate random processes and is defined through data-determined linear combinations of the multivariate component trajectories, which are characterized by a set of varying-coefficient functions. The time varying linear relationships that govern the components of multivariate random functions yield insights about the underlying processes and also lead to noise-reduced representations of the multivariate component trajectories. The functional linear manifold model proposed is put to the task for an analysis of longitudinally observed behavioural patterns of flying, feeding, walking and resting over the lifespan of Drosophila flies and is also investigated in simulations.

 
more » « less
PAR ID:
10401333
Author(s) / Creator(s):
;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume:
76
Issue:
3
ISSN:
1369-7412
Format(s):
Medium: X Size: p. 605-626
Size(s):
p. 605-626
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Multivariate functional data present theoretical and practical complications that are not found in univariate functional data. One of these is a situation where the component functions of multivariate functional data are positive and are subject to mutual time warping. That is, the component processes exhibit a common shape but are subject to systematic phase variation across their domains in addition to subject‐specific time warping, where each subject has its own internal clock. This motivates a novel model for multivariate functional data that connect such mutual time warping to a latent‐deformation‐based framework by exploiting a novel time‐warping separability assumption. This separability assumption allows for meaningful interpretation and dimension reduction. The resulting latent deformation model is shown to be well suited to represent commonly encountered functional vector data. The proposed approach combines a random amplitude factor for each component with population‐based registration across the components of a multivariate functional data vector and includes a latent population function, which corresponds to a common underlying trajectory. We propose estimators for all components of the model, enabling implementation of the proposed data‐based representation for multivariate functional data and downstream analyses such as Fréchet regression. Rates of convergence are established when curves are fully observed or observed with measurement error. The usefulness of the model, interpretations, and practical aspects are illustrated in simulations and with application to multivariate human growth curves and multivariate environmental pollution data.

     
    more » « less
  2. Summary

    We provide insights into new methodology for the analysis of multilevel binary data observed longitudinally, when the repeated longitudinal measurements are correlated. The proposed model is logistic functional regression conditioned on three latent processes describing the within- and between-variability, and describing the cross-dependence of the repeated longitudinal measurements. We estimate the model components without employing mixed-effects modeling but assuming an approximation to the logistic link function. The primary objectives of this article are to highlight the challenges in the estimation of the model components, to compare two approximations to the logistic regression function, linear and exponential, and to discuss their advantages and limitations. The linear approximation is computationally efficient whereas the exponential approximation applies for rare events functional data. Our methods are inspired by and applied to a scientific experiment on spectral backscatter from long range infrared light detection and ranging (LIDAR) data. The models are general and relevant to many new binary functional data sets, with or without dependence between repeated functional measurements.

     
    more » « less
  3. Summary

    Varying-coefficient linear models arise from multivariate nonparametric regression, non-linear time series modelling and forecasting, functional data analysis, longitudinal data analysis and others. It has been a common practice to assume that the varying coefficients are functions of a given variable, which is often called an index. To enlarge the modelling capacity substantially, this paper explores a class of varying-coefficient linear models in which the index is unknown and is estimated as a linear combination of regressors and/or other variables. We search for the index such that the derived varying-coefficient model provides the least squares approximation to the underlying unknown multidimensional regression function. The search is implemented through a newly proposed hybrid backfitting algorithm. The core of the algorithm is the alternating iteration between estimating the index through a one-step scheme and estimating coefficient functions through one-dimensional local linear smoothing. The locally significant variables are selected in terms of a combined use of the t-statistic and the Akaike information criterion. We further extend the algorithm for models with two indices. Simulation shows that the methodology proposed has appreciable flexibility to model complex multivariate non-linear structure and is practically feasible with average modern computers. The methods are further illustrated through the Canadian mink–muskrat data in 1925–1994 and the pound–dollar exchange rates in 1974–1983.

     
    more » « less
  4. Abstract

    In this article, we introduce a functional structural equation model for estimating directional relations from multivariate functional data. We decouple the estimation into two major steps: directional order determination and selection through sparse functional regression. We first propose a score function at the linear operator level, and show that its minimization can recover the true directional order when the relation between each function and its parental functions is nonlinear. We then develop a sparse functional additive regression, where both the response and the multivariate predictors are functions and the regression relation is additive and nonlinear. We also propose strategies to speed up the computation and scale up our method. In theory, we establish the consistencies of order determination, sparse functional additive regression, and directed acyclic graph estimation, while allowing both the dimension of the Karhunen–Loéve expansion coefficients and the number of random functions to diverge with the sample size. We illustrate the efficacy of our method through simulations, and an application to brain effective connectivity analysis.

     
    more » « less
  5. Summary

    We propose a class of semiparametric functional regression models to describe the influence of vector-valued covariates on a sample of response curves. Each observed curve is viewed as the realization of a random process, composed of an overall mean function and random components. The finite dimensional covariates influence the random components of the eigenfunction expansion through single-index models that include unknown smooth link and variance functions. The parametric components of the single-index models are estimated via quasi-score estimating equations with link and variance functions being estimated nonparametrically. We obtain several basic asymptotic results. The functional regression models proposed are illustrated with the analysis of a data set consisting of egg laying curves for 1000 female Mediterranean fruit-flies (medflies).

     
    more » « less