Title: Optimizing differential equations to fit data and predict outcomes
Abstract

Many scientific problems focus on observed patterns of change or on how to design a system to achieve particular dynamics. Those problems often require fitting differential equation models to target trajectories. Fitting such models can be difficult because each evaluation of the fit must calculate the distance between the model and target patterns at numerous points along a trajectory. The gradient of the fit with respect to the model parameters can be challenging to compute. Recent technical advances in automatic differentiation through numerical differential equation solvers potentially change the fitting process into a relatively easy problem, opening up new possibilities to study dynamics. However, application of the new tools to real data may fail to achieve a good fit. This article illustrates how to overcome a variety of common challenges, using the classic ecological data for oscillations in hare and lynx populations. Models include simple ordinary differential equations (ODEs) and neural ordinary differential equations (NODEs), which use artificial neural networks to estimate the derivatives of differential equation systems. Comparing the fits obtained with ODEs versus NODEs, representing small and large parameter spaces, and changing the number of variable dimensions provide insight into the geometry of the observed and model trajectories. To analyze the quality of the models for predicting future observations, a Bayesian‐inspired preconditioned stochastic gradient Langevin dynamics (pSGLD) calculation of the posterior distribution of predicted model trajectories clarifies the tendency for various models to underfit or overfit the data. Coupling fitted differential equation systems with pSGLD sampling provides a powerful way to study the properties of optimization surfaces, raising an analogy with mutation‐selection dynamics on fitness landscapes.
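To make the fitting process concrete, here is a minimal sketch, assuming JAX and its experimental `odeint`, of differentiating a trajectory-matching loss through a numerical ODE solver. The Lotka-Volterra form echoes the hare-lynx setting, but every name and parameter value below is illustrative, not taken from the paper.

```python
import jax
import jax.numpy as jnp
from jax.experimental.ode import odeint

def lotka_volterra(y, t, theta):
    # Hare (h) and lynx (l) derivatives; theta = (a, b, c, d).
    h, l = y
    a, b, c, d = theta
    return jnp.array([a * h - b * h * l, c * h * l - d * l])

def loss(theta, y0, t_obs, y_obs):
    # Distance between the model trajectory and the target trajectory,
    # evaluated at every observation time along the trajectory.
    y_model = odeint(lotka_volterra, y0, t_obs, theta)
    return jnp.mean((y_model - y_obs) ** 2)

# jax.grad differentiates *through* the solver (via the adjoint method),
# giving the gradient of the fit with respect to the model parameters.
grad_loss = jax.jit(jax.grad(loss))

t_obs = jnp.linspace(0.0, 20.0, 41)
y0 = jnp.array([30.0, 4.0])                     # illustrative initial state
y_obs = odeint(lotka_volterra, y0, t_obs,
               jnp.array([0.55, 0.028, 0.026, 0.84]))  # synthetic "data"

theta = jnp.array([0.5, 0.02, 0.02, 0.8])       # illustrative initial guess
for _ in range(500):
    # Plain gradient descent; the step size is for illustration only and
    # would need tuning in a real fit.
    theta = theta - 1e-3 * grad_loss(theta, y0, t_obs, y_obs)
```

The pSGLD calculation mentioned in the abstract reuses this same gradient: instead of a plain descent step, each update applies a diagonal preconditioner and injects scaled Gaussian noise, so the sequence of parameter values samples an approximate posterior rather than settling on a single optimum.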

 
Award ID(s): 1939423
NSF-PAR ID: 10419246
Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Journal Name: Ecology and Evolution
Volume: 13
Issue: 3
ISSN: 2045-7758
Sponsoring Org: National Science Foundation
More Like this
  1. Modeling physiochemical relationships using dynamic data is a common task in fields throughout science and engineering. A common step in developing generalizable, mechanistic models is to fit unmeasured parameters to measured data. However, fitting differential equation-based models can be computationally intensive and uncertain due to the presence of nonlinearity, noise, and sparsity in the data, which can cause convergence to local minima or outright divergence. This work proposes a merger of machine learning (ML) and mechanistic approaches by employing ML models as a means to fit nonlinear mechanistic ordinary differential equations (ODEs). Using a two-stage indirect approach, neural ODEs are used to estimate state derivatives, which are then used to estimate the parameters of a more interpretable mechanistic ODE model. In addition to its computational efficiency, the proposed method demonstrates the ability of neural ODEs to estimate derivative information better than interpolating methods based on algebraic data-driven models. Most notably, the proposed method is shown to yield accurate predictions even when little information is known about the parameters of the ODEs. The proposed parameter estimation approach is believed to be most advantageous when the ODE to be fit is strongly nonlinear with respect to its unknown parameters.
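As a rough illustration of the second stage of this indirect approach (not the authors' code): many mechanistic ODEs are linear in their unknown parameters once the states are known, so regressing the mechanistic right-hand side onto derivative estimates from a stage-one neural ODE reduces to linear least squares. The Lotka-Volterra form and all data below are illustrative stand-ins, assuming JAX.

```python
import jax.numpy as jnp

def fit_mechanistic_lv(y_hat, dy_hat):
    # Stage 2 of the indirect approach: regress mechanistic Lotka-Volterra
    # parameters onto the state and derivative estimates (y_hat, dy_hat)
    # that a trained stage-1 neural ODE would supply.
    h, l = y_hat[:, 0], y_hat[:, 1]
    # dh/dt = a*h - b*h*l is linear in (a, b) given the states.
    a, b = jnp.linalg.lstsq(jnp.stack([h, -h * l], axis=1), dy_hat[:, 0])[0]
    # dl/dt = c*h*l - d*l is linear in (c, d) given the states.
    c, d = jnp.linalg.lstsq(jnp.stack([h * l, -l], axis=1), dy_hat[:, 1])[0]
    return a, b, c, d

# Illustrative stand-in for stage-1 output (smoothed states, derivatives).
t = jnp.linspace(0.0, 10.0, 100)
y_hat = jnp.stack([2.0 + jnp.sin(t), 1.0 + 0.5 * jnp.cos(t)], axis=1)
dy_hat = jnp.stack([jnp.cos(t), -0.5 * jnp.sin(t)], axis=1)
print(fit_mechanistic_lv(y_hat, dy_hat))
```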
  2. In this paper, we compare the performance between systems of ordinary and (Caputo) fractional differential equations depicting the susceptible-exposed-infectious-recovered (SEIR) models of diseases. In order to understand the origins of both approaches as mean-field approximations of integer and fractional stochastic processes, we introduce the fractional differential equations (FDEs) as approximations of some type of fractional nonlinear birth and death processes. Then, we examine the validity of the two approaches against empirical courses of epidemics; we fit both of them to case counts of three measles epidemics that occurred during the pre-vaccination era in three different locations. While ordinary differential equations (ODEs) are commonly used to model epidemics, FDEs are more flexible in fitting empirical data and theoretically offer improved model predictions. The question arises whether, in practice, the benefits of using FDEs over ODEs outweigh the added computational complexity. While important differences in transient dynamics were observed, the FDE outperformed the ODE in only one out of three data sets. In general, FDE modeling approaches may be worthwhile in situations with large, refined data sets and good numerical algorithms.
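To make the ODE-versus-FDE comparison concrete, here is a minimal sketch, assuming JAX, of an SEIR right-hand side together with an explicit Grünwald-Letnikov scheme for the corresponding Caputo FDE. The scheme reduces to forward Euler at order alpha = 1, and all rate values are illustrative, not fitted to the measles data.

```python
import jax.numpy as jnp

def seir(y, beta=0.9, sigma=0.5, gamma=0.3):
    # Standard SEIR derivatives for a normalized population.
    s, e, i, r = y
    return jnp.array([-beta * s * i,
                      beta * s * i - sigma * e,
                      sigma * e - gamma * i,
                      gamma * i])

def caputo_euler(f, y0, alpha, h, n_steps):
    # Explicit Grunwald-Letnikov scheme for the Caputo FDE D^alpha y = f(y).
    # With alpha = 1 the weights collapse to (1, -1, 0, ...) and the update
    # reduces to forward Euler, recovering the ordinary SEIR model.
    w = [1.0]
    for j in range(1, n_steps + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / j))   # binomial weights
    ys = [y0]
    for n in range(1, n_steps + 1):
        memory = sum(w[j] * (ys[n - j] - y0) for j in range(1, n))
        ys.append(y0 + h**alpha * f(ys[-1]) - memory)
    return jnp.stack(ys)

trajectory = caputo_euler(seir, jnp.array([0.99, 0.0, 0.01, 0.0]),
                          alpha=0.9, h=0.1, n_steps=300)
```

The memory sum is what makes FDE integration costly: each step revisits the whole history, which is part of the computational trade-off the abstract weighs.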
  3. We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference. HBNODEs have two properties that imply practical advantages over NODEs: (i) The adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers, thus significantly reducing the number of function evaluations (NFEs) and improving the utility of the trained models. (ii) The spectrum of HBNODEs is well structured, enabling effective learning of long-term dependencies from complex sequential data. We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling. Our method requires remarkably fewer forward and backward NFEs, is more accurate, and learns long-term dependencies more effectively than the other ODE-based neural network models. Code is available at https://github.com/hedixia/HeavyBallNODE. 
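The heavy-ball construction is easy to state in code: the NODE state is augmented with a momentum variable, and the learned vector field drives only the momentum. A minimal sketch, assuming JAX; the damping value and the stand-in network are illustrative, and the actual HBNODE implementation lives at the linked repository.

```python
import jax.numpy as jnp
from jax.experimental.ode import odeint

def heavy_ball_field(state, t, f, gamma=1.0):
    # Augmented dynamics x' = m, m' = -gamma * m + f(x, t): the continuous
    # limit of momentum (heavy ball) accelerated gradient descent.
    x, m = jnp.split(state, 2)
    return jnp.concatenate([m, -gamma * m + f(x, t)])

f = lambda x, t: -x                        # stand-in for the learned network
state0 = jnp.concatenate([jnp.ones(2), jnp.zeros(2)])   # x0, zero momentum
ts = jnp.linspace(0.0, 5.0, 50)
traj = odeint(lambda s, t: heavy_ball_field(s, t, f), state0, ts)
```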
  4. The method of choice for integrating the time-dependent Fokker–Planck equation (FPE) in high dimension is to generate samples from the solution via integration of the associated stochastic differential equation (SDE). Here, we study an alternative scheme based on integrating an ordinary differential equation that describes the flow of probability. Acting as a transport map, this equation deterministically pushes samples from the initial density onto samples from the solution at any later time. Unlike integration of the stochastic dynamics, the method has the advantage of giving direct access to quantities that are challenging to estimate from trajectories alone, such as the probability current, the density itself, and its entropy. The probability flow equation depends on the gradient of the logarithm of the solution (its 'score'), and so is a priori unknown. To resolve this dependence, we model the score with a deep neural network that is learned on the fly by propagating a set of samples according to the instantaneous probability current. We show theoretically that the proposed approach controls the Kullback–Leibler (KL) divergence from the learned solution to the target, while learning on external samples from the SDE does not control either direction of the KL divergence. Empirically, we consider several high-dimensional FPEs from the physics of interacting particle systems. We find that the method accurately matches analytical solutions when they are available, as well as moments computed via Monte Carlo when they are not. Moreover, the method offers compelling predictions for the global entropy production rate that outperform those obtained from learning on stochastic trajectories, and can effectively capture non-equilibrium steady-state probability currents over long time intervals.
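A minimal sketch, assuming JAX, of the transport step this abstract describes: samples are pushed by the probability-flow velocity, which combines the drift of the underlying SDE with the score (hand-specified here; learned by a neural network in the paper). The Ornstein-Uhlenbeck check below is an illustrative sanity test, not the paper's experiment.

```python
import jax.numpy as jnp

def probability_flow_velocity(x, drift, score, D=1.0):
    # For the FPE dp/dt = -div(b p) + D * lap(p), the probability-flow ODE
    # dx/dt = b(x) - D * score(x) deterministically transports samples of
    # the initial density onto samples of the solution at later times;
    # score(x) = grad log p(x, t).
    return drift(x) - D * score(x)

# Sanity check: for an Ornstein-Uhlenbeck process at equilibrium, b(x) = -x
# and the stationary density is a standard Gaussian with score -x, so the
# probability-flow velocity vanishes everywhere.
drift = lambda x: -x
score = lambda x: -x          # stands in for the learned score network
x = jnp.array([0.7, -1.2])
print(probability_flow_velocity(x, drift, score))   # ~[0., 0.]
```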
  5. Learning multi-agent system dynamics has been extensively studied for various real-world applications, such as molecular dynamics in biology, multi-body systems in physics, and particle dynamics in materials science. Most existing models are built to learn the dynamics of a single system: they learn from observed historical data and predict the future trajectory. In practice, however, we might observe multiple systems generated across different environments, which differ in latent exogenous factors such as temperature and gravity. One simple solution is to learn multiple environment-specific models, but this fails to exploit the potential commonalities among the dynamics across environments and offers poor predictions where per-environment data are sparse or limited. Here, we present GG-ODE (Generalized Graph Ordinary Differential Equations), a machine learning framework for learning continuous multi-agent system dynamics across environments. Our model learns system dynamics using neural ordinary differential equations (ODEs) parameterized by Graph Neural Networks (GNNs) to capture the continuous interactions among agents. We achieve model generalization by assuming that the dynamics across different environments are governed by common physical laws, which can be captured by learning a shared ODE function. The distinct latent exogenous factors learned for each environment are incorporated into the ODE function to account for their differences. To improve model performance, we additionally design two regularization losses to (1) enforce orthogonality between the learned initial states and exogenous factors via mutual information minimization; and (2) reduce the temporal variance of the learned exogenous factors within the same system via contrastive learning. Experiments over various physical simulations show that our model can accurately predict system dynamics, especially over long ranges, and can generalize well to new systems with few observations.
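As a rough, hypothetical illustration of the architecture described above (all shapes, weights, and the aggregation rule are stand-ins, not GG-ODE's actual design): a graph-message-passing derivative function whose output is conditioned on a per-environment latent vector, sketched in JAX.

```python
import jax
import jax.numpy as jnp

def gnn_ode_field(h, t, adj, W_msg, W_upd, z_env):
    # Toy message-passing derivative in the spirit of GG-ODE: dh_i/dt
    # depends on aggregated neighbor messages and a per-environment latent
    # factor z_env that is shared across time within one environment.
    msgs = adj @ jnp.tanh(h @ W_msg)          # aggregate neighbor messages
    z = jnp.broadcast_to(z_env, (h.shape[0], z_env.shape[0]))
    return jnp.tanh(jnp.concatenate([h, msgs, z], axis=1) @ W_upd)

# Illustrative shapes: 5 agents, state dimension 4, environment latent 3.
key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
h0 = jax.random.normal(k1, (5, 4))
adj = jnp.ones((5, 5)) / 5.0                  # fully connected, averaged
W_msg = jax.random.normal(k2, (4, 4)) * 0.1
W_upd = jax.random.normal(k3, (4 + 4 + 3, 4)) * 0.1
z_env = jax.random.normal(k4, (3,))
print(gnn_ode_field(h0, 0.0, adj, W_msg, W_upd, z_env).shape)   # (5, 4)
```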