

This content will become publicly available on January 15, 2026

Title: Extending empirical dynamic modeling to cross-sectional data beyond traditional time series
Abstract: The foundation of empirical dynamic modeling (EDM) is representing time-series data as the trajectory of a dynamic system in a multidimensional state space rather than as a collection of traces of individual variables changing through time. Takens’s theorem provides a rigorous basis for adopting this state-space view even from a single time series, but there is considerable additional value in building out a state space with explicit covariates. Multivariate EDM case studies to date, however, generally build understanding from the univariate case up to the multivariate case and use lag-coordinate embeddings for critical steps along the path of analysis. Here, we propose an alternative set of steps for multivariate EDM analysis when the traditional roadmap is not practicable. The general approach borrows ideas of random data projection from compressed sensing, but additional justification is described within the framework of Takens’s theorem. We then detail algorithms that implement this alternative method and validate them through application to simulated model data. The model demonstrations are constructed to show explicitly that this approach can extend EDM application from time-series trajectories to what are effectively realizations of the underlying vector field, i.e., data sets that measure change over time with very short formal time series but are otherwise “big” in terms of number of variables and samples.
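The two ingredients the abstract names, lag-coordinate embedding and random data projection, can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not reproduce; the function names `lag_embed` and `random_projection`, the Gaussian projection matrix, and the parameters `E`, `tau`, and `k` are assumptions chosen for illustration only.

```python
import numpy as np


def lag_embed(x, E, tau=1):
    """Lag-coordinate embedding of a 1-D series (illustrative helper).

    Row i is (x[t], x[t - tau], ..., x[t - (E - 1) * tau]) with
    t = (E - 1) * tau + i, so the result has len(x) - (E - 1) * tau rows.
    """
    x = np.asarray(x, dtype=float)
    start = (E - 1) * tau
    if start >= len(x):
        raise ValueError("series too short for this E and tau")
    cols = [x[start - j * tau : len(x) - j * tau] for j in range(E)]
    return np.column_stack(cols)


def random_projection(X, k, seed=0):
    """Project the rows of X (samples x variables) onto k random directions.

    Assumption: a dense Gaussian matrix scaled by 1/sqrt(k), a common choice
    in the compressed-sensing literature; the paper may use something else.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ R


# A short scalar series embedded in E = 3 dimensions.
embedded = lag_embed(np.arange(5), E=3, tau=1)

# A "wide" cross-sectional data set (10 samples, 50 variables) reduced to 4 coordinates.
projected = random_projection(np.ones((10, 50)), k=4)
```

The first helper needs a series long enough to supply `E` lags, which is the requirement that breaks down for very short time series. The second needs only one multivariate snapshot per sample, which is why a projection of many measured variables is a plausible substitute coordinate system in the setting the abstract describes.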
Award ID(s):
1660584
PAR ID:
10570330
Author(s) / Creator(s):
; ;
Publisher / Repository:
bioRxiv
Date Published:
Format(s):
Medium: X
Institution:
bioRxiv
Sponsoring Org:
National Science Foundation
More Like This
  1. Abstract Causal inference in complex systems has been advanced largely by temporal causation models. However, temporal models have serious limitations when time series data are unavailable or show insignificant variation, which is a common challenge in Earth system science. Meanwhile, there are few spatial causation models for fully exploring the rich spatial cross-sectional data in Earth systems. The generalized embedding theorem proves that observations can be combined to construct the state space of the dynamic system, and if two variables are from the same dynamic system, they are causally linked. Inspired by this, here we show a Geographical Convergent Cross Mapping (GCCM) model for spatial causal inference, which uses cross-mapping prediction on spatial cross-sectional data in a reconstructed state space. Three typical cases, where clearly existing causations cannot be measured through temporal models, demonstrate that GCCM can detect weak to moderate causations when the correlation is not significant. When the coupling between two variables is significant and strong, GCCM is advantageous in identifying the primary causation direction and better revealing the bidirectional asymmetric causation, overcoming the mirroring effect.
  2. How can social and health researchers study complex dynamic systems that function in nonlinear and even chaotic ways? Common methods, such as experiments and equation-based models, may be ill-suited to this task. To address the limitations of existing methods and offer nonparametric tools for characterizing and testing causality in nonlinear dynamic systems, we introduce the edm command in Stata. This command implements three key empirical dynamic modeling (EDM) methods for time series and panel data: 1) simplex projection, which characterizes the dimensionality of a system and the degree to which it appears to function deterministically; 2) S-maps, which quantify the degree of nonlinearity in a system; and 3) convergent cross-mapping, which offers a nonparametric approach to modeling causal effects. We illustrate these methods using simulated data on daily Chicago temperature and crime, showing an effect of temperature on crime but not the reverse. We conclude by discussing how EDM allows checking the assumptions of traditional model-based methods, such as residual autocorrelation tests, and we advocate for EDM because it does not assume linearity, stability, or equilibrium.
  3. Summary Mobile health has emerged as a major success for tracking individual health status, due to the popularity and power of smartphones and wearable devices. This has also brought great challenges in handling heterogeneous, multi-resolution data that arise ubiquitously in mobile health due to irregular multivariate measurements collected from individuals. In this paper, we propose an individualized dynamic latent factor model for irregular multi-resolution time series data to interpolate unsampled measurements of time series with low resolution. One major advantage of the proposed method is the capability to integrate multiple irregular time series and multiple subjects by mapping the multi-resolution data to the latent space. In addition, the proposed individualized dynamic latent factor model is applicable to capturing heterogeneous longitudinal information through individualized dynamic latent factors. Our theory provides a bound on the integrated interpolation error and the convergence rate for B-spline approximation methods. Both the simulation studies and the application to smartwatch data demonstrate the superior performance of the proposed method compared to existing methods. 
  4. We introduce VISTA, a clustering approach for multivariate and irregularly sampled time series based on a parametric state space mixture model. VISTA is specifically designed for the unsupervised identification of groups in datasets originating from healthcare and psychology where such sampling issues are commonplace. Our approach adapts linear Gaussian state space models (LGSSMs) to provide a flexible parametric framework for fitting a wide range of time series dynamics. The clustering approach itself is based on the assumption that the population can be represented as a mixture of a fixed number of LGSSMs. VISTA’s model formulation allows for an explicit derivation of the log-likelihood function, from which we develop an expectation-maximization scheme for fitting model parameters to the observed data samples. Our algorithmic implementation is designed to handle populations of multivariate time series that can exhibit large changes in sampling rate as well as irregular sampling. We evaluate the versatility and accuracy of our approach on simulated and real-world datasets, including demographic trends, wearable sensor data, epidemiological time series, and ecological momentary assessments. Our results indicate that VISTA outperforms most comparable standard time series clustering methods. We provide an open-source implementation of VISTA in Python.
  5. Classifying multivariate time series (MTS), which record the values of multiple variables over a continuous period of time, has gained a lot of attention. However, existing techniques suffer from two major issues. First, the long-range dependencies of the time-series sequences are not well captured. Second, the interactions of multiple variables are generally not represented in features. To address these issues, we propose a novel Cross Attention Stabilized Fully Convolutional Neural Network (CA-SFCN) to classify MTS data. First, we introduce a temporal attention mechanism to extract long- and short-term memories across all time steps. Second, variable attention is designed to select relevant variables at each time step. CA-SFCN is compared with 16 approaches using 14 different MTS datasets. The extensive experimental results show that CA-SFCN outperforms state-of-the-art classification methods, and the cross attention mechanism achieves better performance than other attention mechanisms.