

This content will become publicly available on October 1, 2024

Title: Causal Deep Operator Networks for Data-Driven Modeling of Dynamical Systems
The deep operator network (DeepONet) architecture is a promising approach for learning functional operators that can represent dynamical systems described by ordinary or partial differential equations. However, it has two major limitations: it fails to account for initial conditions, and it does not guarantee temporal causality, a fundamental property of dynamical systems. This paper proposes a novel causal deep operator network (Causal-DeepONet) architecture that incorporates both the initial condition and temporal causality into data-driven learning of dynamical systems, overcoming the limitations of the original DeepONet approach. This is achieved by adding an independent root network for the initial condition and independent branch networks that are conditioned, or switched on/off, by time-shifted step functions or sigmoid functions to express temporal causality. The proposed architecture was evaluated against two baseline deep neural network methods and the original DeepONet method on learning the thermal dynamics of a room in a building from real data. It not only achieved the best overall prediction accuracy but also substantially improved the consistency of accuracy across multistep predictions, which is crucial for predictive control.
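The gating idea in the abstract can be sketched numerically. The following is a minimal toy, not the authors' implementation: the "networks" are reduced to scalar terms, but it shows how time-shifted sigmoid gates (smooth approximations of step functions) switch each branch's contribution on only after its associated input time has passed, while a separate root term carries the initial condition.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gate(t, t_shift, k=50.0):
    # Time-shifted sigmoid: ~0 before t_shift, ~1 after.
    # Large k approaches a hard step function.
    return sigmoid(k * (t - t_shift))

def causal_deeponet_toy(t, u, y0, t_shifts, k=50.0):
    """Toy Causal-DeepONet forward pass with scalar 'networks'
    (a hypothetical simplification of the paper's architecture):
    - a root term carries the initial condition y0;
    - branch i sees the input sample u[i] but is switched on by its
      gate only once t passes t_shifts[i], so inputs that lie in the
      future relative to t cannot influence the prediction."""
    root = y0  # root network: identity on y0 in this toy
    branches = sum(gate(t, s, k) * u_i for u_i, s in zip(u, t_shifts))
    return root + branches
```

With `y0 = 0` and inputs `u = [1.0, 1.0]` sampled at shifts `[0.0, 1.0]`, the prediction at `t = 0.5` reflects only the first input; changing the second (future) input leaves it essentially unchanged, which is the causality property the architecture enforces.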
Award ID(s):
2138388 2238296
NSF-PAR ID:
10491592
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
Proceeding of the IEEE International Conference on Systems, Man, and Cybernetics
ISSN:
2577-1655
Page Range / eLocation ID:
1136 to 1141
Format(s):
Medium: X
Location:
Honolulu, Oahu, HI, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Objective: Inferring causal or effective connectivity between measured time series is crucial to understanding directed interactions in complex systems. This task is especially challenging in the brain, as the underlying dynamics are not well understood. This paper introduces a novel causality measure called frequency-domain convergent cross-mapping (FDCCM) that exploits frequency-domain dynamics through nonlinear state-space reconstruction. Method: Using synthesized chaotic time series, we investigate the general applicability of FDCCM at different causal strengths and noise levels. We also apply our method to two resting-state Parkinson's datasets with 31 and 54 subjects, respectively. To this end, we construct causal networks, extract network features, and perform machine learning analysis to distinguish Parkinson's disease patients (PD) from age- and gender-matched healthy controls (HC). Specifically, we use the FDCCM networks to compute the betweenness centrality of the network nodes, which serve as features for the classification models. Result: The analysis on simulated data showed that FDCCM is resilient to additive Gaussian noise, making it suitable for real-world applications. Our proposed method also decodes scalp-EEG signals to classify the PD and HC groups with approximately 97% leave-one-subject-out cross-validation accuracy. We compared decoders from six cortical regions and found that features derived from the left temporal lobe lead to a higher classification accuracy of 84.5% compared to other regions. Moreover, when the classifier trained using FDCCM networks from one dataset was tested on an independent out-of-sample dataset, it attained an accuracy of 84%. This accuracy is significantly higher than that of correlational networks (45.2%) and CCM networks (54.84%). Significance: These findings suggest that our spectral-based causality measure can improve classification performance and reveal useful network biomarkers of Parkinson's disease.
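The core of convergent cross-mapping, the time-domain method that FDCCM extends, can be sketched compactly. Below is a minimal numpy version assuming a Takens delay embedding and simplex-style nearest-neighbor weights; it illustrates plain CCM, not the frequency-domain extension proposed in the paper.

```python
import numpy as np

def delay_embed(x, E, tau):
    # Takens delay-coordinate embedding: row t is
    # [x[t], x[t - tau], ..., x[t - (E - 1) * tau]].
    n = len(x) - (E - 1) * tau
    return np.column_stack(
        [x[(E - 1) * tau - j * tau:(E - 1) * tau - j * tau + n] for j in range(E)]
    )

def ccm_skill(x, y, E=3, tau=1):
    """Cross-map skill for 'x causally drives y': reconstruct y's shadow
    manifold, use its E + 1 nearest neighbors of each point to predict
    the contemporaneous x, and correlate predictions with the true x."""
    M = delay_embed(y, E, tau)
    target = x[(E - 1) * tau:]
    pred = np.empty(len(M))
    for i in range(len(M)):
        d = np.linalg.norm(M - M[i], axis=1)
        d[i] = np.inf                # exclude the point itself
        nn = np.argsort(d)[:E + 1]   # simplex neighborhood
        w = np.exp(-d[nn] / max(d[nn[0]], 1e-12))
        pred[i] = np.dot(w / w.sum(), target[nn])
    return np.corrcoef(pred, target)[0, 1]
```

On Sugihara-style coupled logistic maps, where x unidirectionally forces y, the cross-map skill computed from y's manifold toward x approaches 1 with enough data, which is the convergence signature CCM looks for.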
  2. Abstract

    Climate models are essential for understanding and projecting climate change, yet long-standing biases and uncertainties in their projections remain. These are largely associated with the representation of subgrid-scale processes, particularly clouds and convection. Deep learning can learn these subgrid-scale processes from computationally expensive storm-resolving models while retaining many of their features at a fraction of the computational cost. Yet climate simulations with embedded neural network parameterizations remain challenging and depend strongly on the deep learning solution. This is likely due to spurious, non-physical correlations learned by the neural networks because of the complexity of the physical dynamical system. Here, we show that combining causality with deep learning helps remove spurious correlations and optimize the neural network algorithm. Specifically, we apply a causal discovery method to unveil causal drivers in the set of input predictors of atmospheric subgrid-scale processes of a superparameterized climate model in which deep convection is explicitly resolved. The resulting causally informed neural networks are coupled to the climate model, replacing the superparameterization and radiation scheme. We show that climate simulations with causally informed neural network parameterizations retain many convection-related properties and accurately generate the climate of the original high-resolution climate model, while retaining generalization capabilities to unseen climates similar to those of the non-causal approach. The combination of causal discovery and deep learning is a new and promising approach that leads to stable and more trustworthy climate simulations and paves the way toward more physically based causal deep learning approaches in other scientific disciplines as well.

     
  3. Time-evolution of partial differential equations is fundamental to modeling many complex dynamical processes and to event forecasting, but the operators associated with such problems are non-linear. We propose a Padé-approximation-based exponential neural operator scheme for efficiently learning the map between a given initial condition and the activities at a later time. Multiwavelet bases are used for space discretization. By explicitly embedding the exponential operators in the model, we reduce the number of training parameters and make the model more data-efficient, which is essential when dealing with scarce and noisy real-world datasets. The Padé exponential operator uses a recurrent structure with shared parameters to model the non-linearity, in contrast to recent neural operators that rely on multiple linear operator layers in succession. We show theoretically that the gradients associated with the recurrent Padé network are bounded across the recurrent horizon. We perform experiments on non-linear systems such as the Korteweg-de Vries (KdV) and Kuramoto-Sivashinsky (KS) equations to show that the proposed approach achieves the best performance while remaining data-efficient. We also show that urgent real-world problems like epidemic forecasting (for example, COVID-19) can be formulated as a 2D time-varying operator problem. The proposed Padé exponential operators yield better prediction results (53% better MAE than the best neural operator, and 52% better than the best non-neural-operator deep learning model) compared to state-of-the-art forecasting models.
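The "Padé exponential operator" ingredient rests on Padé approximation of the matrix exponential. A minimal sketch of the lowest-order Padé(1,1) (Cayley) form, checked against a truncated Taylor series, is shown below; it illustrates the approximation itself, not the paper's full multiwavelet neural operator.

```python
import numpy as np

def pade_expm_11(A):
    """Padé(1,1) (Cayley) approximation of the matrix exponential:
    exp(A) ~ (I - A/2)^(-1) (I + A/2), accurate for small ||A||.
    Production routines (e.g. scaling-and-squaring) use higher-order
    Padé forms, but the rational structure is the same."""
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - A / 2.0, I + A / 2.0)

def taylor_expm(A, terms=20):
    # Reference: truncated Taylor series of exp(A), for comparison only.
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out
```

Repeatedly applying `pade_expm_11(A * dt)` advances a linear system y' = Ay by steps of size dt; this repeated application with a fixed operator is the kind of recurrent shared-parameter structure the abstract alludes to, here shown for the linear part only.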