This content will become publicly available on October 1, 2024

Title: Causal Deep Operator Networks for Data-Driven Modeling of Dynamical Systems
The deep operator network (DeepONet) architecture is a promising approach for learning functional operators, which can represent dynamical systems described by ordinary or partial differential equations. However, it has two major limitations: it fails to account for initial conditions and to guarantee temporal causality, a fundamental property of dynamical systems. This paper proposes a novel causal deep operator network (Causal-DeepONet) architecture that incorporates both the initial condition and temporal causality into data-driven learning of dynamical systems, overcoming these limitations of the original DeepONet approach. This is achieved by adding an independent root network for the initial condition and independent branch networks that are conditioned, or switched on and off, by time-shifted step functions or sigmoid functions to express temporal causality. The proposed architecture was evaluated and compared with two baseline deep neural network methods and the original DeepONet method on learning the thermal dynamics of a room in a building using real data. It was shown not only to achieve the best overall prediction accuracy but also to substantially improve the consistency of accuracy in multistep predictions, which is crucial for predictive control.
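The gating idea in the abstract can be illustrated with a small, untrained NumPy sketch. It is not the authors' implementation: the way the root, branch, and trunk outputs are combined (a gated sum followed by a dot product with the trunk output), the layer sizes, the sensor times, and all function names are assumptions made for illustration only. It shows how a time-shifted step gate keeps an input sampled at time t_k from influencing a prediction at any earlier time t < t_k.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_params(n_in, n_hidden, n_out):
    # Random, untrained weights; enough to illustrate the structure.
    return (rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(size=(n_hidden, n_out)), np.zeros(n_out))

def mlp(params, x):
    W1, b1, W2, b2 = params
    return np.tanh(x @ W1 + b1) @ W2 + b2

def step_gate(t, t_k):
    # Time-shifted step: branch k is switched on only once t >= t_k.
    return np.where(t >= t_k, 1.0, 0.0)

def sigmoid_gate(t, t_k, sharpness=20.0):
    # Smooth alternative; only approximately causal (hypothetical sharpness).
    return 1.0 / (1.0 + np.exp(-sharpness * (t - t_k)))

P = 8                                      # latent width (assumed)
K = 4                                      # number of input samples (assumed)
sensor_times = np.linspace(0.0, 0.75, K)   # times at which u is sampled (assumed)

root = mlp_params(1, 16, P)                        # encodes the initial condition
branches = [mlp_params(1, 16, P) for _ in range(K)]  # one branch per input sample
trunk = mlp_params(1, 16, P)                       # encodes the query time

def predict(y0, u, t, gate=step_gate):
    # Assumed combination rule: root output plus gated branch outputs,
    # then a dot product with the trunk output at query time t.
    latent = mlp(root, np.array([y0]))
    for k in range(K):
        latent = latent + gate(t, sensor_times[k]) * mlp(branches[k], np.array([u[k]]))
    return float(latent @ mlp(trunk, np.array([t])))

u_a = np.array([0.1, 0.2, 0.3, 0.4])
u_b = np.array([0.1, 0.2, 0.3, 5.0])   # differs only at sensor time 0.75

same_before = predict(1.0, u_a, 0.5) == predict(1.0, u_b, 0.5)
differs_after = predict(1.0, u_a, 0.9) != predict(1.0, u_b, 0.9)
```

With the step gate, the two input signals give identical predictions at t = 0.5 because the branch for the sample at t = 0.75 is still switched off; they diverge once the query time passes 0.75. Passing `gate=sigmoid_gate` gives a differentiable variant in which future inputs are strongly attenuated rather than exactly excluded.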
Award ID(s):
2138388 2238296
Journal Name:
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics
Page Range / eLocation ID:
1136 to 1141
Medium: X
Honolulu, Oahu, HI, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. The Deep Operator Network (DeepONet) framework is a different class of neural network architecture that one trains to learn nonlinear operators, i.e., mappings between infinite-dimensional spaces. Traditionally, DeepONets are trained using a centralized strategy that requires transferring the training data to a centralized location. Such a strategy, however, limits our ability to secure data privacy or use high-performance distributed/parallel computing platforms. To alleviate such limitations, in this paper, we study the federated training of DeepONets for the first time. That is, we develop a framework, which we refer to as Fed-DeepONet, that allows multiple clients to train DeepONets collaboratively under the coordination of a centralized server. To achieve Fed-DeepONets, we propose an efficient stochastic gradient-based algorithm that enables the distributed optimization of the DeepONet parameters by averaging first-order estimates of the DeepONet loss gradient. Then, to accelerate the training convergence of Fed-DeepONets, we propose a moment-enhanced (i.e., adaptive) stochastic gradient-based strategy. Finally, we verify the performance of Fed-DeepONet by learning, for different configurations of the number of clients and fractions of available clients, (i) the solution operator of a gravity pendulum and (ii) the dynamic response of a parametric library of pendulums. 
  2. Deep operator network (DeepONet) has demonstrated great success in various learning tasks, including learning solution operators of partial differential equations. In particular, it provides an efficient approach to predict the evolution equations in a finite time horizon. Nevertheless, the vanilla DeepONet suffers from the issue of stability degradation in long-time prediction. This paper proposes a transfer-learning aided DeepONet to enhance the stability. Our idea is to use transfer learning to sequentially update the DeepONets as the surrogates for propagators learned in different time frames. The evolving DeepONets can better track the varying complexities of the evolution equations, while only needing to be updated by efficient training of a tiny fraction of the operator networks. Through systematic experiments, we show that the proposed method not only improves the long-time accuracy of DeepONet while maintaining similar computational cost but also substantially reduces the sample size of the training set.
  3. Abstract

    This paper focuses on the feasibility of deep neural operator network (DeepONet) as a robust surrogate modeling method within the context of digital twin (DT) enabling technology for nuclear energy systems. Machine learning (ML)-based prediction algorithms that need extensive retraining for new reactor operational conditions may prohibit real-time inference for DT across varying scenarios. In this study, DeepONet is trained across possible operational conditions, which relaxes the requirement of continuous retraining and makes it suitable for online, real-time prediction components of a DT. Through benchmarking and evaluation, DeepONet exhibits remarkable prediction accuracy and speed, outperforming traditional ML methods, making it a suitable algorithm for real-time DT inference in solving a challenging particle transport problem. DeepONet also exhibits generalizability and computational efficiency as a surrogate tool for DT components. However, the application of DeepONet reveals challenges related to optimal sensor placement and model evaluation, critical aspects of real-world DT implementation. Addressing these challenges will further enhance the method’s practicality and reliability. Overall, this study marks an important step towards harnessing the power of DeepONet surrogate modeling for real-time inference capability within the context of DT enabling technology for nuclear systems.

  4. Abstract

    Climate models are essential to understand and project climate change, yet long‐standing biases and uncertainties in their projections remain. This is largely associated with the representation of subgrid‐scale processes, particularly clouds and convection. Deep learning can learn these subgrid‐scale processes from computationally expensive storm‐resolving models while retaining many features at a fraction of computational cost. Yet, climate simulations with embedded neural network parameterizations are still challenging and highly depend on the deep learning solution. This is likely associated with spurious non‐physical correlations learned by the neural networks due to the complexity of the physical dynamical system. Here, we show that the combination of causality with deep learning helps remove spurious correlations and optimize the neural network algorithm. To resolve this, we apply a causal discovery method to unveil causal drivers in the set of input predictors of atmospheric subgrid‐scale processes of a superparameterized climate model in which deep convection is explicitly resolved. The resulting causally‐informed neural networks are coupled to the climate model, hence replacing the superparameterization and radiation scheme. We show that the climate simulations with causally‐informed neural network parameterizations retain many convection‐related properties and accurately generate the climate of the original high‐resolution climate model, while retaining similar generalization capabilities to unseen climates compared to the non‐causal approach. The combination of causal discovery and deep learning is a new and promising approach that leads to stable and more trustworthy climate simulations and paves the way toward more physically‐based causal deep learning approaches also in other scientific disciplines.

  5. Abstract

    The deep operator network (DeepONet) structure has shown great potential in approximating complex solution operators with low generalization errors. Recently, a sequential DeepONet (S-DeepONet) was proposed to use sequential learning models in the branch of DeepONet to predict final solutions given time-dependent inputs. In the current work, the S-DeepONet architecture is extended by modifying the information combination mechanism between the branch and trunk networks to simultaneously predict vector solutions with multiple components at multiple time steps of the evolution history, which is the first in the literature using DeepONets. Two example problems, one on transient fluid flow and the other on path-dependent plastic loading, were shown to demonstrate the capabilities of the model to handle different physics problems. The use of a trained S-DeepONet model in inverse parameter identification via the genetic algorithm is shown to demonstrate the application of the model. In almost all cases, the trained model achieved an $R^2$ value of above 0.99 and a relative $L_2$ error of less than 10% with only 3200 training data points, indicating superior accuracy. The vector S-DeepONet model, having only 0.4% more parameters than a scalar model, can predict two output components simultaneously at an accuracy similar to the two independently trained scalar models with a 20.8% faster training time. The S-DeepONet inference is at least three orders of magnitude faster than direct numerical simulations, and inverse parameter identifications using the trained model are highly efficient and accurate.
