Title: Hybrid^2 Neural ODE Causal Modeling and an Application to Glycemic Response
Hybrid models composing mechanistic ODE-based dynamics with flexible and expressive neural network components have grown rapidly in popularity, especially in scientific domains where such ODE-based modeling offers important interpretability and validated causal grounding (e.g., for counterfactual reasoning). Incorporating mechanistic models also adds inductive bias to standard black-box modeling approaches, which is critical when learning from small datasets or from partially observed, complex systems. Unfortunately, as hybrid models become more flexible, the causal grounding provided by the mechanistic model can quickly be lost. We address this problem by leveraging another common source of domain knowledge: the ranking of treatment effects for a set of interventions, even when the precise treatment effects are unknown. We encode this information in a causal loss that we combine with the standard predictive loss to arrive at a hybrid loss that biases learning towards causally valid hybrid models. We demonstrate that this achieves a win-win, combining state-of-the-art predictive performance with causal validity, in the challenging task of modeling post-exercise glucose dynamics in individuals with type 1 diabetes.
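The idea of combining a predictive loss with a ranking-based causal loss can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the mean-squared-error predictive term, the hinge-style penalty on violated rankings, and the weighting parameter `alpha` are all hypothetical choices.

```python
# Illustrative sketch only: the loss forms, names, and `alpha` weighting are
# assumptions, not the formulation used in the paper.

def predictive_loss(y_pred, y_true):
    """Mean squared error between predicted and observed trajectories."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


def causal_ranking_loss(effects, ranked_pairs, margin=0.0):
    """Hinge-style penalty for violating known treatment-effect rankings.

    effects:      model-estimated treatment effect for each intervention
    ranked_pairs: (i, j) pairs meaning "effect of i should exceed effect of j"
    """
    if not ranked_pairs:
        return 0.0
    penalties = [
        max(0.0, margin - (effects[i] - effects[j])) for i, j in ranked_pairs
    ]
    return sum(penalties) / len(penalties)


def hybrid_loss(y_pred, y_true, effects, ranked_pairs, alpha=0.5, margin=0.0):
    """Convex combination of the predictive and causal terms."""
    pred = predictive_loss(y_pred, y_true)
    causal = causal_ranking_loss(effects, ranked_pairs, margin)
    return (1.0 - alpha) * pred + alpha * causal


if __name__ == "__main__":
    y_pred, y_true = [1.0, 2.0], [1.0, 4.0]
    effects = [0.5, 1.5, 1.0]   # hypothetical estimated effects
    pairs = [(0, 1)]            # domain knowledge: effect 0 > effect 1 (violated here)
    print(hybrid_loss(y_pred, y_true, effects, pairs, alpha=0.5))
```

In this sketch, a model whose estimated effects respect every known ranking incurs no causal penalty, so the hybrid loss reduces to a scaled predictive loss; any violated ranking adds a penalty proportional to the size of the violation.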
Award ID(s):
2205084
PAR ID:
10532316
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
41st International Conference on Machine Learning (ICML)
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Accurately modeling the non-linear dynamics of a system from measurement data is an open challenge. Over the past few years, various tools such as SINDy and DySMHO have emerged as approaches to distill dynamics from data. However, challenges persist in accurately capturing the dynamics of a system, especially when physical knowledge about the system is unavailable. A promising solution is a hybrid paradigm that combines mechanistic and black-box models to leverage their respective strengths. In this study, we combine a hybrid modeling paradigm with sparse regression to develop and identify models simultaneously. Two methods are explored across different case studies, considering varying complexities, data quality, and data availability. In the first approach, we integrate SINDy-discovered models with neural ODE structures to model unknown physics. In the second approach, we employ Multifidelity Surrogate Models (MFSMs) to construct composite models comprised of SINDy-discovered models and error-correction models.
  2. We introduce EINNs, a framework crafted for epidemic forecasting that builds upon the theoretical grounds provided by mechanistic models as well as the data-driven expressibility afforded by AI models, and their capabilities to ingest heterogeneous information. Although neural forecasting models have been successful in multiple tasks, predictions well-correlated with epidemic trends and long-term predictions remain open challenges. Epidemiological ODE models contain mechanisms that can guide us in these two tasks; however, they have limited capability of ingesting data sources and modeling composite signals. Thus, we propose to leverage work in physics-informed neural networks to learn latent epidemic dynamics and transfer relevant knowledge to another neural network which ingests multiple data sources and has more appropriate inductive bias. In contrast with previous work, we do not assume the observability of complete dynamics and do not need to numerically solve the ODE equations during training. Our thorough experiments on all US states and HHS regions for COVID-19 and influenza forecasting showcase the clear benefits of our approach in both short-term and long-term forecasting as well as in learning the mechanistic dynamics over other non-trivial alternatives. 
  3. The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from noisily and partially observed data. We compare pure data-driven learning with hybrid models which incorporate imperfect domain knowledge, referring to the discrepancy between an assumed truth model and the imperfect mechanistic model as model error. Our formulation is agnostic to the chosen machine learning model, is presented in both continuous- and discrete-time settings, and is compatible both with model errors that exhibit substantial memory and errors that are memoryless. First, we study memoryless linear (w.r.t. parametric-dependence) model error from a learning theory perspective, defining excess risk and generalization error. For ergodic continuous-time systems, we prove that both excess risk and generalization error are bounded above by terms that diminish with the square root of T, the time-interval over which training data is specified. Secondly, we study scenarios that benefit from modeling with memory, proving universal approximation theorems for two classes of continuous-time recurrent neural networks (RNNs): both can learn memory-dependent model error, assuming that it is governed by a finite-dimensional hidden variable and that, together, the observed and hidden variables form a continuous-time Markovian system. In addition, we connect one class of RNNs to reservoir computing, thereby relating learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features. Numerical results are presented (Lorenz ’63, Lorenz ’96 Multiscale systems) to compare purely data-driven and hybrid approaches, finding hybrid methods less data-hungry and more parametrically efficient. We also find that, while a continuous-time framing allows for robustness to irregular sampling and desirable domain interpretability, a discrete-time framing can provide similar or better predictive performance, especially when data are undersampled and the vector field defining the true dynamics cannot be identified. Finally, we demonstrate numerically how data assimilation can be leveraged to learn hidden dynamics from noisy, partially-observed data, and illustrate challenges in representing memory by this approach, and in the training of such models.
  4. Modeling physiochemical relationships using dynamic data is a common task in fields throughout science and engineering. A common step in developing generalizable, mechanistic models is to fit unmeasured parameters to measured data. However, fitting differential equation-based models can be computation-intensive and uncertain due to the presence of nonlinearity, noise, and sparsity in the data, which in turn causes convergence to local minima and divergence issues. This work proposes a merger of machine learning (ML) and mechanistic approaches by employing ML models as a means to fit nonlinear mechanistic ordinary differential equations (ODEs). Using a two-stage indirect approach, neural ODEs are used to estimate state derivatives, which are then used to estimate the parameters of a more interpretable mechanistic ODE model. In addition to its computational efficiency, the proposed method demonstrates the ability of neural ODEs to better estimate derivative information than interpolating methods based on algebraic data-driven models. Most notably, the proposed method is shown to yield accurate predictions even when little information is known about the parameters of the ODEs. The proposed parameter estimation approach is believed to be most advantageous when the ODE to be fit is strongly nonlinear with respect to its unknown parameters. 
  5. Alzheimer’s disease (AD) is believed to occur when abnormal amounts of the proteins amyloid beta and tau aggregate in the brain, resulting in a progressive loss of neuronal function. Hippocampal neurons in transgenic mice with amyloidopathy or tauopathy exhibit altered intrinsic excitability properties. We used deep hybrid modeling (DeepHM), a recently developed parameter inference technique that combines deep learning with biophysical modeling, to map experimental data recorded from hippocampal CA1 neurons in transgenic AD mice and age-matched wildtype littermate controls to the parameter space of a conductance-based CA1 model. Although mechanistic modeling and machine learning methods are by themselves powerful tools for approximating biological systems and making accurate predictions from data, when used in isolation these approaches suffer from distinct shortcomings: model and parameter uncertainty limit mechanistic modeling, whereas machine learning methods disregard the underlying biophysical mechanisms. DeepHM addresses these shortcomings by using conditional generative adversarial networks to provide an inverse mapping of data to mechanistic models that identifies the distributions of mechanistic modeling parameters coherent to the data. Here, we demonstrated that DeepHM accurately infers parameter distributions of the conductance-based model on several test cases using synthetic data generated with complex underlying parameter structures. We then used DeepHM to estimate parameter distributions corresponding to the experimental data and infer which ion channels are altered in the Alzheimer’s mouse models compared to their wildtype controls at 12 and 24 months. We found that the conductances most disrupted by tauopathy, amyloidopathy, and aging are delayed rectifier potassium, transient sodium, and hyperpolarization-activated potassium, respectively.