
This content will become publicly available on March 1, 2022

Title: Learning differential equation models from stochastic agent-based model simulations
Agent-based models provide a flexible framework that is frequently used for modelling many biological systems, including cell migration, molecular dynamics, ecology and epidemiology. Analysis of the model dynamics can be challenging due to their inherent stochasticity and heavy computational requirements. Common approaches to the analysis of agent-based models include extensive Monte Carlo simulation of the model or the derivation of coarse-grained differential equation models to predict the expected or averaged output from the agent-based model. Both of these approaches have limitations, however, as extensive computation of complex agent-based models may be infeasible, and coarse-grained differential equation models can fail to accurately describe model dynamics in certain parameter regimes. We propose that methods from the equation learning field provide a promising, novel and unifying approach for agent-based model analysis. Equation learning is a recent field of research from data science that aims to infer differential equation models directly from data. We use this tutorial to review how methods from equation learning can be used to learn differential equation models from agent-based model simulations. We demonstrate that this framework is easy to use, requires few model simulations, and accurately predicts model dynamics in parameter regions where coarse-grained differential equation models fail to do so. We highlight these advantages through several case studies involving two agent-based models that are broadly applicable to biological phenomena: a birth–death–migration model commonly used to explore cell biology experiments and a susceptible–infected–recovered model of infectious disease spread.
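The idea of learning a differential equation from agent-based simulations can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a simplified, well-mixed birth-death model without migration, and it uses plain least-squares regression onto a two-term candidate library rather than a full sparse-regression method. The function names (`simulate_abm`, `learn_growth_law`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_abm(rng, n_sites=5000, c0=0.05, p_birth=0.1, p_death=0.02, n_steps=120):
    """Well-mixed birth-death agent-based model (illustrative assumption).

    Each step, every agent attempts a birth with probability p_birth, which
    succeeds only if a randomly chosen site is empty, and dies with
    probability p_death. Returns the agent density at every step.
    """
    n = int(c0 * n_sites)
    density = [n / n_sites]
    for _ in range(n_steps):
        births = rng.binomial(n, p_birth * (1.0 - n / n_sites))
        deaths = rng.binomial(n, p_death)
        n += births - deaths
        density.append(n / n_sites)
    return np.array(density)

def learn_growth_law(runs):
    """Regress per-step density increments onto the library [C, C^2]."""
    c = np.concatenate([run[:-1] for run in runs])       # state at step t
    dc = np.concatenate([np.diff(run) for run in runs])  # forward difference, dt = 1
    library = np.column_stack([c, c**2])
    coeffs, *_ = np.linalg.lstsq(library, dc, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
runs = [simulate_abm(rng) for _ in range(10)]  # a small number of simulations
a, b = learn_growth_law(runs)
print(f"learned:    dC/dt ~ {a:.3f} C + ({b:.3f}) C^2")
print("mean-field: dC/dt = 0.080 C - 0.100 C^2")
```

For this toy model the mean-field equation is dC/dt = (p_birth - p_death) C - p_birth C^2, so the learned coefficients can be compared directly against it. With a larger candidate library, a sparsity-promoting step (for example, thresholding small coefficients) would typically be added to select the active terms.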
Award ID(s):
1838314
NSF-PAR ID:
10302341
Journal Name:
Journal of The Royal Society Interface
Volume:
18
Issue:
176
ISSN:
1742-5662
Sponsoring Org:
National Science Foundation
More Like this
  1. Simulating the dynamics of ions near polarizable nanoparticles (NPs) using coarse-grained models is extremely challenging due to the need to solve the Poisson equation at every simulation timestep. Recently, a molecular dynamics (MD) method based on a dynamical optimization framework bypassed this obstacle by representing the polarization charge density as virtual dynamic variables and evolving them in parallel with the physical dynamics of ions. We highlight the computational gains accessible with the integration of machine learning (ML) methods for parameter prediction in MD simulations by demonstrating how they were realized in MD simulations of ions near polarizable NPs. An artificial neural network–based regression model was integrated with MD simulation and predicted the optimal simulation timestep and optimization parameters characterizing the virtual system with 94.3% success. The ML-enabled auto-tuning of parameters generated accurate dynamics of ions for ≈ 10 million steps while improving the stability of the simulation by over an order of magnitude. The integration of ML-enhanced framework with hybrid Open Multi-Processing / Message Passing Interface (OpenMP/MPI) parallelization techniques reduced the computational time of simulating systems with thousands of ions and induced charges from thousands of hours to tens of hours, yielding a maximum speedup of ≈ 3 from ML-only acceleration and a maximum speedup of ≈ 600 from the combination of ML and parallel computing methods. Extraction of ionic structure in concentrated electrolytes near oil–water emulsions demonstrates the success of the method. The approach can be generalized to select optimal parameters in other MD applications and energy minimization problems.
  2. We developed coarse-grained models of spike proteins in SARS-CoV-2 coronavirus and angiotensin-converting enzyme 2 (ACE2) receptor proteins to study the endocytosis of a whole coronavirus under physiologically relevant spatial and temporal scales. We first conducted all-atom explicit-solvent molecular dynamics simulations of the recently characterized structures of spike and ACE2 proteins. We then established coarse-grained models using the shape-based coarse-graining approach based on the protein crystal structures and extracted the force field parameters from the all-atom simulation trajectories. To further analyze the coarse-grained models, we carried out normal mode analysis of the coarse-grained models to refine the force field parameters by matching the fluctuations of the internal coordinates with the original all-atom simulations. Finally, we demonstrated the capability of these coarse-grained models by simulating the endocytosis of a whole coronavirus through the host cell membrane. We embedded the coarse-grained models of spikes on the surface of the virus envelope and anchored ACE2 receptors on the host cell membrane, which is modeled using a one-particle-thick lipid bilayer model. The coarse-grained simulations show the spike proteins adopt bent configurations due to their unique flexibility during their interaction with the ACE2 receptors, which makes it easier for them to attach to the host cell membrane than rigid spikes.
  3. Abstract Background

    The biophysics of an organism span multiple scales from subcellular to organismal and include processes characterized by spatial properties, such as the diffusion of molecules, cell migration, and flow of intravenous fluids. Mathematical biology seeks to explain biophysical processes in mathematical terms at, and across, all relevant spatial and temporal scales, through the generation of representative models. While non-spatial, ordinary differential equation (ODE) models are often used and readily calibrated to experimental data, they do not explicitly represent the spatial and stochastic features of a biological system, limiting their insights and applications. However, spatial models describing biological systems with spatial information are mathematically complex and computationally expensive, which limits the ability to calibrate and deploy them and highlights the need for simpler methods able to model the spatial features of biological systems.

    Results

    In this work, we develop a formal method for deriving cell-based, spatial, multicellular models from ODE models of population dynamics in biological systems, and vice versa. We provide examples of generating spatiotemporal, multicellular models from ODE models of viral infection and immune response. In these models, the determinants of agreement of spatial and non-spatial models are the degree of spatial heterogeneity in viral production and rates of extracellular viral diffusion and decay. We show how ODE model parameters can implicitly represent spatial parameters, and cell-based spatial models can generate uncertain predictions through sensitivity to stochastic cellular events, which is not a feature of ODE models. Using our method, we can test ODE models in a multicellular, spatial context and translate information to and from non-spatial and spatial models, which help to employ spatiotemporal multicellular models using calibrated ODE model parameters. We additionally investigate objects and processes implicitly represented by ODE model terms and parameters and improve the reproducibility of spatial, stochastic models.

    Conclusion

    We developed and demonstrate a method for generating spatiotemporal, multicellular models from non-spatial population dynamics models of multicellular systems. We envision employing our method to generate new ODE model terms from spatiotemporal and multicellular models, recast popular ODE models on a cellular basis, and generate better models for critical applications where spatial and stochastic features affect outcomes.

  4. We investigate approximate Bayesian inference techniques for nonlinear systems described by ordinary differential equation (ODE) models. In particular, the approximations will be based on set-valued reachability analysis approaches, yielding approximate models for the posterior distribution. Nonlinear ODEs are widely used to mathematically describe physical and biological models. However, these models are often described by parameters that are not directly measurable and have an impact on the system behaviors. Often, noisy measurement data combined with physical/biological intuition serve as the means for finding appropriate values of these parameters. Our approach operates under a Bayesian framework, given prior distribution over the parameter space and noisy observations under a known sampling distribution. We explore subsets of the space of model parameters, computing bounds on the likelihood for each subset. This is performed using nonlinear set-valued reachability analysis that is made faster by means of linearization around a reference trajectory. The tiling of the parameter space can be adaptively refined to make bounds on the likelihood tighter. We evaluate our approach on a variety of nonlinear benchmarks and compare our results with Markov Chain Monte Carlo and Sequential Monte Carlo approaches.

  5. Abstract

    Advances in single-cell technologies allow scrutinizing of heterogeneous cell states, however, detecting cell-state transitions from snap-shot single-cell transcriptome data remains challenging. To investigate cells with transient properties or mixed identities, we present MuTrans, a method based on multiscale reduction technique to identify the underlying stochastic dynamics that prescribes cell-fate transitions. By iteratively unifying transition dynamics across multiple scales, MuTrans constructs the cell-fate dynamical manifold that depicts progression of cell-state transitions, and distinguishes stable and transition cells. In addition, MuTrans quantifies the likelihood of all possible transition trajectories between cell states using coarse-grained transition path theory. Downstream analysis identifies distinct genes that mark the transient states or drive the transitions. The method is consistent with the well-established Langevin equation and transition rate theory. Applying MuTrans to datasets collected from five different single-cell experimental platforms, we show its capability and scalability to robustly unravel complex cell fate dynamics induced by transition cells in systems such as tumor EMT, iPSC differentiation and blood cell differentiation. Overall, our method bridges data-driven and model-based approaches on cell-fate transitions at single-cell resolution.
