Abstract Causal inference in complex systems has advanced considerably with the development of temporal causation models. However, temporal models have serious limitations when time series data are unavailable or vary too little, a common challenge in Earth system science. Meanwhile, few spatial causation models exist for fully exploring the rich spatial cross-sectional data in Earth systems. The generalized embedding theorem proves that observations can be combined to construct the state space of a dynamic system, and if two variables are from the same dynamic system, they are causally linked. Inspired by this, we present a Geographical Convergent Cross Mapping (GCCM) model for spatial causal inference, based on cross-mapping prediction from spatial cross-sectional data in a reconstructed state space. Three typical cases, in which clearly existing causations cannot be measured through temporal models, demonstrate that GCCM can detect weak to moderate causations when the correlation is not significant. When the coupling between two variables is significant and strong, GCCM is advantageous in identifying the primary causation direction and better revealing bidirectional asymmetric causation, overcoming the mirroring effect.
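The cross-mapping prediction mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the state vectors have already been built (from time lags in ordinary CCM, or from spatial lags as the abstract describes for GCCM) and is not the authors' GCCM implementation. The function name `cross_map_skill`, the parameter `k`, and the exponential distance weighting are illustrative assumptions.

```python
import numpy as np


def cross_map_skill(state, target, k=None):
    """Predict `target` from nearest neighbors in `state`; return the correlation.

    state  : (n, dim) array of reconstructed state vectors (assumed prebuilt).
    target : (n,) array of the variable to be cross-mapped.
    k      : number of neighbors; defaults to dim + 1 (an assumption).
    """
    state = np.asarray(state, dtype=float)
    target = np.asarray(target, dtype=float)
    n, dim = state.shape
    if k is None:
        k = dim + 1
    if n <= k:
        raise ValueError("need more points than neighbors")

    pred = np.empty(n)
    for i in range(n):
        dist = np.linalg.norm(state - state[i], axis=1)
        dist[i] = np.inf  # leave the point itself out of its own prediction
        nn = np.argsort(dist)[:k]
        # Closer neighbors get exponentially larger weights.
        w = np.exp(-dist[nn] / max(dist[nn[0]], 1e-12))
        pred[i] = np.dot(w, target[nn]) / w.sum()

    # Skill is the Pearson correlation between predictions and observations.
    return float(np.corrcoef(pred, target)[0, 1])
```

In the CCM framework, high skill when predicting X from the state space of Y is read as evidence that X influences Y. A full analysis would also check that skill converges as the number of library points grows, which this sketch does not do.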
This content will become publicly available on January 15, 2026
Extending empirical dynamic modeling to cross-sectional data beyond traditional time series
Abstract The foundation of empirical dynamic modeling (EDM) lies in representing time-series data as the trajectory of a dynamic system in a multidimensional state space rather than as a collection of traces of individual variables changing through time. Takens’s theorem provides a rigorous basis for adopting this state-space view of time-series data even from just a single time series, but there is considerable additional value in building out a state space with explicit covariates. Multivariate EDM case studies to date, however, generally build up understanding from univariate to multivariate analysis and use lag-coordinate embeddings for critical steps along the way. Here, we propose an alternative set of steps for multivariate EDM analysis when the traditional roadmap is not practicable. The general approach borrows ideas of random data projection from compressed sensing, but additional justification is described within the framework of Takens’s theorem. We then detail algorithms that implement this alternative method and validate them through application to simulated model data. The model demonstrations are constructed to show explicitly that this approach can extend EDM from time-series trajectories to what are effectively realizations of the underlying vector field, i.e., data sets that measure change over time with very short formal time series but are otherwise “big” in terms of number of variables and samples.
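For context, the lag-coordinate embedding used in the traditional roadmap can be sketched as follows. This illustrates the standard Takens-style construction from a single series, not the random-projection alternative proposed in the paper. The function name `delay_embed` and the parameters `E` (embedding dimension) and `tau` (lag) are illustrative choices.

```python
import numpy as np


def delay_embed(x, E, tau=1):
    """Return the lag-coordinate embedding of a 1-D series.

    Row i holds (x[t], x[t - tau], ..., x[t - (E - 1) * tau])
    with t = i + (E - 1) * tau.
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - (E - 1) * tau  # number of complete state vectors
    if n <= 0:
        raise ValueError("series too short for this E and tau")
    # Column j is the series shifted back by j * tau steps.
    cols = [x[(E - 1 - j) * tau:(E - 1 - j) * tau + n] for j in range(E)]
    return np.column_stack(cols)


# Five observations embedded in three dimensions give three state vectors:
# (2, 1, 0), (3, 2, 1), (4, 3, 2).
print(delay_embed(np.arange(5), E=3))
```

The sketch makes the limitation visible: a series of length N yields only N - (E - 1) * tau state vectors, so very short time series leave few or no points to work with.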
- Award ID(s): 1660584
- PAR ID: 10570330
- Publisher / Repository: bioRxiv
- Date Published:
- Format(s): Medium: X
- Institution: bioRxiv
- Sponsoring Org: National Science Foundation
More Like this
- How can social and health researchers study complex dynamic systems that function in nonlinear and even chaotic ways? Common methods, such as experiments and equation-based models, may be ill-suited to this task. To address the limitations of existing methods and offer nonparametric tools for characterizing and testing causality in nonlinear dynamic systems, we introduce the edm command in Stata. This command implements three key empirical dynamic modeling (EDM) methods for time series and panel data: 1) simplex projection, which characterizes the dimensionality of a system and the degree to which it appears to function deterministically; 2) S-maps, which quantify the degree of nonlinearity in a system; and 3) convergent cross-mapping, which offers a nonparametric approach to modeling causal effects. We illustrate these methods using simulated data on daily Chicago temperature and crime, showing an effect of temperature on crime but not the reverse. We conclude by discussing how EDM allows checking the assumptions of traditional model-based methods, such as residual autocorrelation tests, and we advocate for EDM because it does not assume linearity, stability, or equilibrium.
- Summary Mobile health has emerged as a major success for tracking individual health status, due to the popularity and power of smartphones and wearable devices. This has also brought great challenges in handling heterogeneous, multi-resolution data that arise ubiquitously in mobile health due to irregular multivariate measurements collected from individuals. In this paper, we propose an individualized dynamic latent factor model for irregular multi-resolution time series data to interpolate unsampled measurements of time series with low resolution. One major advantage of the proposed method is the capability to integrate multiple irregular time series and multiple subjects by mapping the multi-resolution data to the latent space. In addition, the proposed individualized dynamic latent factor model is applicable to capturing heterogeneous longitudinal information through individualized dynamic latent factors. Our theory provides a bound on the integrated interpolation error and the convergence rate for B-spline approximation methods. Both the simulation studies and the application to smartwatch data demonstrate the superior performance of the proposed method compared to existing methods.
- Classifying multivariate time series (MTS), which record the values of multiple variables over a continuous period of time, has gained a lot of attention. However, existing techniques suffer from two major issues. First, the long-range dependencies of the time-series sequences are not well captured. Second, the interactions of multiple variables are generally not represented in features. To address these issues, we propose a novel Cross Attention Stabilized Fully Convolutional Neural Network (CA-SFCN) to classify MTS data. First, we introduce a temporal attention mechanism to extract long- and short-term memories across all time steps. Second, variable attention is designed to select relevant variables at each time step. CA-SFCN is compared with 16 approaches using 14 different MTS datasets. The extensive experimental results show that CA-SFCN outperforms state-of-the-art classification methods, and the cross attention mechanism achieves better performance than other attention mechanisms.
- Abstract Most current algorithms for multivariate time series classification tend to overlook the correlations between time series of different variables. In this research, we propose a framework that leverages Eigen-entropy along with a cumulative moving window to derive time series signatures to support the classification task. These signatures are enumerations of correlations among different time series considering the temporal nature of the dataset. To manage the dataset’s dynamic nature, we employ preprocessing with dense multi-scale entropy. Consequently, the proposed framework, Eigen-entropy-based Time Series Signatures, captures correlations among multivariate time series without losing their temporal and dynamic aspects. The efficacy of our algorithm is assessed using six binary datasets sourced from the University of East Anglia, in addition to a publicly available gait dataset and an institutional sepsis dataset from the Mayo Clinic. We use recall as the evaluation metric to compare our approach against baseline algorithms, including dependent dynamic time warping with 1 nearest neighbor and multivariate multi-scale permutation entropy. Our method demonstrates superior performance in terms of recall for seven out of the eight datasets.
