Title: On causal discovery with convergent cross mapping
Convergent cross mapping is a principled causal discovery technique for signals, but its efficacy depends on a number of assumptions about the systems that generated the signals. We present a self-contained introduction to the theory of causality in state-space models, Takens’ theorem, and cross maps, and we propose conditions to check if a signal is appropriate for cross mapping. Further, we propose simple analyses based on Gaussian processes to test for these conditions in data. Using several examples from the literature, we show that our proposed techniques detect when convergent cross mapping may reach erroneous conclusions, and we comment on other considerations that are important when applying methods such as convergent cross mapping.
Award ID(s):
2212506
PAR ID:
10417053
Author(s) / Creator(s):
Editor(s):
Dr. Wing-Kin 
Date Published:
Journal Name:
IEEE Transactions on Signal Processing
ISSN:
1053-587X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: Natural systems exhibit diverse behavior generated by complex interactions between their constituent parts. To characterize these interactions, we introduce Convergent Cross Sorting (CCS), a novel algorithm based on convergent cross mapping (CCM) for estimating dynamic coupling from time series data. CCS extends CCM by using the relative ranking of distances within state-space reconstructions to improve the prior methods’ performance at identifying the existence, relative strength, and directionality of coupling across a wide range of signal and noise characteristics. In particular, relative to CCM, CCS has a large performance advantage when analyzing very short time series data and data from continuous dynamical systems with synchronous behavior. This advantage allows CCS to better uncover the temporal and directional relationships within systems that undergo frequent and short-lived switches in dynamics, such as neural systems. In this paper, we validate CCS on simulated data and demonstrate its applicability to electrophysiological recordings from interacting brain regions.
  2. Rooted in dynamic systems theory, convergent cross mapping (CCM) has attracted increased attention recently due to its ability to detect linear and nonlinear causal coupling in both random and deterministic settings. One limitation of CCM is that it uses both past and future values to predict the current value, which is inconsistent with the widely accepted definition of causality, under which the future values of one process cannot influence the past of another. To overcome this obstacle, in our previous research we introduced causalized convergent cross mapping (cCCM), in which future values are no longer used to predict the current value. In this paper, we focus on the implementation of cCCM in causality analysis. More specifically, we demonstrate the effectiveness of cCCM in identifying both linear and nonlinear causal coupling in various settings through a large number of examples, including Gaussian random variables with additive noise, sinusoidal waveforms, autoregressive models, stochastic processes with a dominant spectral component embedded in noise, deterministic chaotic maps, and systems with memory, as well as experimental fMRI data. In particular, we analyze the impact of shadow manifold construction on the performance of cCCM and provide detailed guidelines on how to configure the key parameters of cCCM in different applications. Overall, our analysis indicates that cCCM is a promising and easy-to-implement tool for causality analysis in a wide spectrum of applications.

     
  3. Abstract. Motivation: Oxford Nanopore Technologies sequencing devices support adaptive sequencing, in which undesired reads can be ejected from a pore in real time. This feature allows targeted sequencing aided by computational methods for mapping partial reads, rather than complex library preparation protocols. However, existing mapping methods either require a computationally expensive base-calling procedure before using aligners to map partial reads or work well only on small genomes. Results: In this work, we present a new streaming method that can map nanopore raw signals for real-time selective sequencing. Rather than converting read signals to bases, we propose to convert reference genomes to signals and fully operate in the signal space. Our method features a new way to index reference genomes using k-d trees, a novel seed selection strategy and a seed chaining algorithm tailored toward the current signal characteristics. We implemented the method as a tool Sigmap. Then we evaluated it on both simulated and real data and compared it to the state-of-the-art nanopore raw signal mapper Uncalled. Our results show that Sigmap yields comparable performance on mapping yeast simulated raw signals, and better mapping accuracy on mapping yeast real raw signals with a 4.4× speedup. Moreover, our method performed well on mapping raw signals to genomes of size >100 Mbp and correctly mapped 11.49% more real raw signals of green algae, which leads to a significantly higher F1-score (0.9354 versus 0.8660). Availability and implementation: Sigmap code is accessible at https://github.com/haowenz/sigmap. Supplementary information: Supplementary data are available at Bioinformatics online.
  4. Objective: Inferring causal or effective connectivity between measured timeseries is crucial to understanding directed interactions in complex systems. This task is especially challenging in the brain as the underlying dynamics are not well-understood. This paper aims to introduce a novel causality measure called frequency-domain convergent cross-mapping (FDCCM) that utilizes frequency-domain dynamics through nonlinear state-space reconstruction. Method: Using synthesized chaotic timeseries, we investigate the general applicability of FDCCM at different causal strengths and noise levels. We also apply our method to two resting-state Parkinson's datasets with 31 and 54 subjects, respectively. To this end, we construct causal networks, extract network features, and perform machine learning analysis to distinguish Parkinson's disease patients (PD) from age- and gender-matched healthy controls (HC). Specifically, we use the FDCCM networks to compute the betweenness centrality of the network nodes, which act as features for the classification models. Result: The analysis on simulated data showed that FDCCM is resilient to additive Gaussian noise, making it suitable for real-world applications. Our proposed method also decodes scalp-EEG signals to classify the PD and HC groups with approximately 97% leave-one-subject-out cross-validation accuracy. We compared decoders from six cortical regions to find that features derived from the left temporal lobe lead to a higher classification accuracy of 84.5% compared to other regions. Moreover, when the classifier trained using FDCCM networks from one dataset was tested on an independent out-of-sample dataset, it attained an accuracy of 84%. This accuracy is significantly higher than that of correlational networks (45.2%) and CCM networks (54.84%). Significance: These findings suggest that our spectral-based causality measure can improve classification performance and reveal useful network biomarkers of Parkinson's disease.
  5. Abbott, Derek (Ed.)
    Abstract

    Convergent cross-mapping (CCM) has attracted increased attention recently due to its capability to detect causality in nonseparable systems under deterministic settings, which may not be covered by the traditional Granger causality. From an information-theoretic perspective, causality is often characterized as the directed information (DI) flowing from one side to the other. As information is essentially nondeterministic, a natural question is: does CCM measure DI flow? Here, we first causalize CCM so that it aligns with the presumption in causality analysis that the future values of one process cannot influence the past of the other, and then establish and validate the approximate equivalence of causalized CCM (cCCM) and DI under Gaussian variables through both theoretical derivations and fMRI-based brain network causality analysis. Our simulation results indicate that, in general, cCCM tends to be more robust than DI in causality detection. The underlying argument is that DI relies heavily on probability estimation, which is sensitive to data size as well as digitization procedures; cCCM, on the other hand, gets around this problem through geometric cross-mapping between the manifolds involved. Overall, our analysis demonstrates that cross-mapping provides an alternative way to evaluate DI and is potentially an effective technique for identifying both linear and nonlinear causal coupling in brain neural networks and other settings, either random or deterministic, or both.

     