Modeling physicochemical relationships using dynamic data is a common task in fields throughout science and engineering. A common step in developing generalizable, mechanistic models is to fit unmeasured parameters to measured data. However, fitting differential equation-based models can be computationally intensive and uncertain due to nonlinearity, noise, and sparsity in the data, which in turn can cause convergence to local minima or outright divergence. This work proposes a merger of machine learning (ML) and mechanistic approaches by employing ML models as a means to fit nonlinear mechanistic ordinary differential equations (ODEs). Using a two-stage indirect approach, neural ODEs are used to estimate state derivatives, which are then used to estimate the parameters of a more interpretable mechanistic ODE model. In addition to its computational efficiency, the proposed method demonstrates the ability of neural ODEs to estimate derivative information better than interpolating methods based on algebraic data-driven models. Most notably, the proposed method is shown to yield accurate predictions even when little information is known about the parameters of the ODEs. The proposed parameter estimation approach is believed to be most advantageous when the ODE to be fit is strongly nonlinear with respect to its unknown parameters.
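As a concrete illustration of the two-stage indirect approach described above, the sketch below shows only the second stage: given state and derivative estimates produced by a fitted surrogate (such as a trained neural ODE), the mechanistic parameters are chosen so the model's right-hand side matches those derivative estimates. The logistic-growth model, parameter names, and synthetic values are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import least_squares

# Illustrative mechanistic model: logistic growth dx/dt = r*x*(1 - x/K),
# with unknown parameters theta = (r, K).
def mechanistic_rhs(x, theta):
    r, K = theta
    return r * x * (1.0 - x / K)

# Stage 1 (assumed done elsewhere): a smooth surrogate such as a neural ODE is
# fit to noisy measurements, then evaluated to give state estimates x_hat and
# derivative estimates d_hat at the sample times. Synthetic stand-ins:
t = np.linspace(0.0, 10.0, 50)
x_hat = 10.0 / (1.0 + 9.0 * np.exp(-0.8 * t))      # smoothed state estimates
d_hat = 0.8 * x_hat * (1.0 - x_hat / 10.0)         # estimated derivatives

# Stage 2: choose theta so the mechanistic right-hand side matches the
# surrogate's derivative estimates (no ODE solves inside the loss).
def residuals(theta):
    return mechanistic_rhs(x_hat, theta) - d_hat

fit = least_squares(residuals, x0=[0.1, 1.0],
                    bounds=([1e-6, 1e-6], [5.0, 100.0]))
print("estimated (r, K):", fit.x)   # should be close to (0.8, 10.0) here
```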
Optimizing differential equations to fit data and predict outcomes
Abstract: Many scientific problems focus on observed patterns of change or on how to design a system to achieve particular dynamics. Those problems often require fitting differential equation models to target trajectories. Fitting such models can be difficult because each evaluation of the fit must calculate the distance between the model and target patterns at numerous points along a trajectory. The gradient of the fit with respect to the model parameters can be challenging to compute. Recent technical advances in automatic differentiation through numerical differential equation solvers potentially change the fitting process into a relatively easy problem, opening up new possibilities to study dynamics. However, application of the new tools to real data may fail to achieve a good fit. This article illustrates how to overcome a variety of common challenges, using the classic ecological data for oscillations in hare and lynx populations. Models include simple ordinary differential equations (ODEs) and neural ordinary differential equations (NODEs), which use artificial neural networks to estimate the derivatives of differential equation systems. Comparing the fits obtained with ODEs versus NODEs, representing small and large parameter spaces, and changing the number of variable dimensions provide insight into the geometry of the observed and model trajectories. To analyze the quality of the models for predicting future observations, a Bayesian-inspired preconditioned stochastic gradient Langevin dynamics (pSGLD) calculation of the posterior distribution of predicted model trajectories clarifies the tendency for various models to underfit or overfit the data. Coupling fitted differential equation systems with pSGLD sampling provides a powerful way to study the properties of optimization surfaces, raising an analogy with mutation-selection dynamics on fitness landscapes.
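The fitting strategy the abstract describes, differentiating through a numerical ODE solver, can be sketched as follows. This is a minimal, hedged example: it uses a hand-rolled RK4 integrator, JAX automatic differentiation, and a Lotka-Volterra predator-prey model as a stand-in for the hare-lynx system. The parameter values, step size, and synthetic "observations" are illustrative, and the step size and iteration count may need tuning.

```python
import jax
import jax.numpy as jnp

def lotka_volterra(state, theta):
    # Hare (prey) h and lynx (predator) l dynamics.
    h, l = state
    a, b, c, d = theta
    return jnp.array([a * h - b * h * l, c * h * l - d * l])

def rk4_trajectory(theta, y0, dt, n_steps):
    # Fixed-step RK4 rolled out with lax.scan so the whole solve is differentiable.
    def step(y, _):
        k1 = lotka_volterra(y, theta)
        k2 = lotka_volterra(y + 0.5 * dt * k1, theta)
        k3 = lotka_volterra(y + 0.5 * dt * k2, theta)
        k4 = lotka_volterra(y + dt * k3, theta)
        y_next = y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        return y_next, y_next
    _, traj = jax.lax.scan(step, y0, None, length=n_steps)
    return traj

def loss(theta, y0, observed, dt):
    pred = rk4_trajectory(theta, y0, dt, observed.shape[0])
    return jnp.mean((pred - observed) ** 2)

# Synthetic "observations" generated from known parameters, then recovered by
# gradient descent through the solver.
true_theta = jnp.array([1.0, 0.5, 0.3, 0.8])
y0 = jnp.array([2.0, 1.0])
dt, n_steps = 0.1, 100
observed = rk4_trajectory(true_theta, y0, dt, n_steps)

theta = jnp.array([0.8, 0.4, 0.4, 0.6])      # initial guess
grad_loss = jax.jit(jax.grad(loss))          # gradient w.r.t. theta via autodiff
for _ in range(2000):                        # plain gradient descent;
    theta = theta - 1e-2 * grad_loss(theta, y0, observed, dt)  # lr may need tuning
print("estimated parameters:", theta)
```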
- Award ID(s): 1939423
- PAR ID: 10419246
- Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name: Ecology and Evolution
- Volume: 13
- Issue: 3
- ISSN: 2045-7758
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In this paper, we compare the performance of systems of ordinary and (Caputo) fractional differential equations depicting the susceptible-exposed-infectious-recovered (SEIR) models of diseases. In order to understand the origins of both approaches as mean-field approximations of integer and fractional stochastic processes, we introduce the fractional differential equations (FDEs) as approximations of certain fractional nonlinear birth and death processes. Then, we examine the validity of the two approaches against empirical courses of epidemics; we fit both of them to case counts of three measles epidemics that occurred during the pre-vaccination era in three different locations. While ordinary differential equations (ODEs) are commonly used to model epidemics, FDEs are more flexible in fitting empirical data and theoretically offer improved model predictions. The question arises whether, in practice, the benefits of using FDEs over ODEs outweigh the added computational complexity. While important differences in transient dynamics were observed, the FDE outperformed the ODE in only one out of three data sets. In general, FDE modeling may be worthwhile for large, refined data sets and good numerical algorithms. (A minimal integer-order SEIR sketch appears after this list.)
- We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of classical momentum-accelerated gradient descent, to improve the training and inference of neural ODEs (NODEs). HBNODEs have two properties that imply practical advantages over NODEs: (i) the adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers, thus significantly reducing the number of function evaluations (NFEs) and improving the utility of the trained models; (ii) the spectrum of HBNODEs is well structured, enabling effective learning of long-term dependencies from complex sequential data. We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling. Our method requires remarkably fewer forward and backward NFEs, is more accurate, and learns long-term dependencies more effectively than other ODE-based neural network models. Code is available at https://github.com/hedixia/HeavyBallNODE. (A sketch of the heavy-ball reformulation appears after this list.)
- The main mathematical result in this paper is that a change of variables in the ordinary differential equation (ODE) for the competition of two infections in a Susceptible-Infected-Removed (SIR) model shows that the fraction of cases due to the new variant satisfies the logistic differential equation, which models selective sweeps. Fitting the logistic to data from the Global Initiative on Sharing All Influenza Data (GISAID) shows that this correctly predicts the rapid turnover from one dominant variant to another. In addition, our fitting gives sensible estimates of the increase in infectivity. These arguments are applicable to any epidemic modeled by SIR equations. (A sketch of the logistic fit appears after this list.)
- Malek-Madani, Reza (Ed.) This short, self-contained article seeks to introduce and survey continuous-time deep learning approaches that are based on neural ordinary differential equations (neural ODEs). It primarily targets readers familiar with ordinary and partial differential equations and their analysis who are curious to see their role in machine learning. Using three examples from machine learning and applied mathematics, we will see how neural ODEs can provide new insights into deep learning and a foundation for more efficient algorithms.
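To accompany the SEIR comparison above, here is a minimal sketch of the integer-order SEIR system integrated with SciPy. Parameter values and initial conditions are illustrative (not fit to the measles data), and the Caputo fractional variant would replace this integrator with a dedicated FDE solver, which is not shown.

```python
import numpy as np
from scipy.integrate import solve_ivp

def seir_rhs(t, y, beta, sigma, gamma):
    # Standard SEIR compartments; the fractional (Caputo) version keeps this
    # right-hand side but replaces d/dt with a fractional derivative of order alpha.
    S, E, I, R = y
    N = S + E + I + R
    return [-beta * S * I / N,
            beta * S * I / N - sigma * E,
            sigma * E - gamma * I,
            gamma * I]

# Illustrative parameters and initial conditions (not fit to the measles data).
beta, sigma, gamma = 0.9, 1 / 8.0, 1 / 5.0
y0 = [9990.0, 5.0, 5.0, 0.0]
t_eval = np.linspace(0.0, 120.0, 121)

sol = solve_ivp(seir_rhs, (0.0, 120.0), y0, t_eval=t_eval,
                args=(beta, sigma, gamma), rtol=1e-8)
print("peak infectious count:", sol.y[2].max())
```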
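The heavy-ball idea behind HBNODEs can be sketched by rewriting the second-order ODE x'' + gamma * x' = f(x, t) as a first-order system in position and momentum. In the sketch below, f is a hand-coded vector field standing in for the neural network, gamma is an illustrative damping value, and SciPy's generic integrator is used; see the authors' repository linked above for the actual implementation.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(t, x):
    # Stand-in for the learned vector field; in an HBNODE this is a neural net.
    return -np.tanh(x)

def heavy_ball_rhs(t, state, gamma):
    # x'' + gamma * x' = f(x, t) written as a first-order system:
    #   dx/dt = m,   dm/dt = -gamma * m + f(x, t)
    x, m = np.split(state, 2)
    return np.concatenate([m, -gamma * m + f(t, x)])

gamma = 1.0                                 # illustrative damping coefficient
x0 = np.array([2.0, -1.5])                  # initial position
m0 = np.zeros_like(x0)                      # initial momentum
sol = solve_ivp(heavy_ball_rhs, (0.0, 10.0), np.concatenate([x0, m0]),
                args=(gamma,), dense_output=True)
print("x(t=10):", sol.y[:2, -1])
```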

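The logistic-sweep result above can be made concrete with a short fit of the logistic solution f(t) = 1 / (1 + exp(-s * (t - t0))) to variant-fraction data using SciPy. The weekly fractions below are synthetic placeholders, not GISAID counts.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, s, t0):
    # Solution of df/dt = s * f * (1 - f): the new variant's share of cases.
    return 1.0 / (1.0 + np.exp(-s * (t - t0)))

# Synthetic weekly variant fractions (placeholders, not GISAID data).
weeks = np.arange(0, 12)
frac = np.array([0.02, 0.04, 0.07, 0.12, 0.21, 0.33,
                 0.48, 0.64, 0.77, 0.86, 0.92, 0.96])

(s_hat, t0_hat), _ = curve_fit(logistic, weeks, frac, p0=[0.5, 6.0])
print(f"selection coefficient s = {s_hat:.2f}, midpoint week t0 = {t0_hat:.2f}")
```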