Small integration time steps limit molecular dynamics (MD) simulations to millisecond time scales. Markov state models (MSMs) and equation-free approaches learn low-dimensional kinetic models from MD simulation data by performing configurational or dynamical coarse-graining of the state space. The learned kinetic models enable the efficient generation of dynamical trajectories over vastly longer time scales than are accessible by MD, but the discretization of configurational space and/or the absence of a means to reconstruct molecular configurations precludes the generation of continuous all-atom molecular trajectories. We propose latent space simulators (LSS) to learn kinetic models for continuous all-atom simulation trajectories by training three deep learning networks to (i) learn the slow collective variables of the molecular system, (ii) propagate the system dynamics within this slow latent space, and (iii) generatively reconstruct molecular configurations. We demonstrate the approach in an application to the Trp-cage miniprotein to produce novel ultra-long synthetic folding trajectories that accurately reproduce all-atom molecular structure, thermodynamics, and kinetics at six orders of magnitude lower cost than MD. The dramatically lower cost of trajectory generation enables greatly improved sampling and greatly reduced statistical uncertainties in estimated thermodynamic averages and kinetic rates.
Building insightful, memory-enriched models to capture long-time biochemical processes from short-time simulations
The ability to predict and understand complex molecular motions occurring over diverse timescales ranging from picoseconds to seconds and even hours in biological systems remains one of the largest challenges to chemical theory. Markov state models (MSMs), which provide a memoryless description of the transitions between different states of a biochemical system, have provided numerous important physically transparent insights into biological function. However, constructing these models often necessitates performing extremely long molecular simulations to converge the rates. Here, we show that by incorporating memory via the time-convolutionless generalized master equation (TCL-GME) one can build a theoretically transparent and physically intuitive memory-enriched model of biochemical processes with up to a three-order-of-magnitude reduction in the simulation data required while also providing a higher temporal resolution. We derive the conditions under which the TCL-GME provides a more efficient means to capture slow dynamics than MSMs and rigorously prove when the two provide equally valid and efficient descriptions of the slow configurational dynamics. We further introduce a simple averaging procedure that enables our TCL-GME approach to quickly converge and accurately predict long-time dynamics even when parameterized with noisy reference data arising from short trajectories. We illustrate the advantages of the TCL-GME using alanine dipeptide, the human argonaute complex, and the FiP35 WW domain.
- Award ID(s): 2154291
- PAR ID: 10409438
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 120
- Issue: 12
- ISSN: 0027-8424
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Lee, Jonghyun; Darve, Eric F.; Kitanidis, Peter K.; Mahoney, Michael W.; Karpatne, Anuj; Farthing, Matthew W.; Hesser, Tyler (Ed.) Modern design, control, and optimization often require multiple expensive simulations of highly nonlinear stiff models. These costs can be amortized by training a cheap surrogate of the full model, which can then be used repeatedly. Here we present a general data-driven method, the continuous time echo state network (CTESN), for generating surrogates of nonlinear ordinary differential equations with dynamics at widely separated timescales. We empirically demonstrate the ability to accelerate a physically motivated scalable model of a heating system by 98x while maintaining relative error of within 0.2%. We showcase the ability for this surrogate to accurately handle highly stiff systems which have been shown to cause training failures with common surrogate methods such as Physics-Informed Neural Networks (PINNs), Long Short Term Memory (LSTM) networks, and discrete echo state networks (ESN). We show that our model captures fast transients as well as slow dynamics, while demonstrating that fixed time step machine learning techniques are unable to adequately capture the multi-rate behavior. Together this provides compelling evidence for the ability of CTESN surrogates to predict and accelerate highly stiff dynamical systems which are unable to be directly handled by previous scientific machine learning techniques.
Long-range synchrony from short-range interactions is a familiar pattern in biological and physical systems, many of which share a common set of ‘universal’ properties at the point of synchronization. Common biological systems of coupled oscillators have been shown to be members of the Ising universality class, meaning that the very simple Ising model replicates certain spatial statistics of these systems at stationarity. This observation is useful because it reveals which aspects of spatial pattern arise independently of the details governing local dynamics, resulting in both deeper understanding of and a simpler baseline model for biological synchrony. However, in many situations a system’s dynamics are of greater interest than their static spatial properties. Here, we ask whether a dynamical Ising model can replicate universal and non-universal features of ecological systems, using noisy coupled metapopulation models with two-cycle dynamics as a case study. The standard Ising model makes unrealistic dynamical predictions, but the Ising model with memory corrects this by using an additional parameter to reflect the tendency for local dynamics to maintain their phase of oscillation. By fitting the two parameters of the Ising model with memory to simulated ecological dynamics, we assess the correspondence between the Ising and ecological models in several of their features (location of the critical boundary in parameter space between synchronous and asynchronous dynamics, probability of local phase changes and ability to predict future dynamics). We find that the Ising model with memory is reasonably good at representing these properties of ecological metapopulations. The correspondence between these models creates the potential for the simple and well-known Ising class of models to become a valuable tool for understanding complex biological systems.
Long and stable timescales are often observed in complex biochemical networks, such as in emergent oscillations. How these robust dynamics persist remains unclear, given the many stochastic reactions and shorter time scales demonstrated by underlying components. We propose a topological model that produces long oscillations around the network boundary, reducing the system dynamics to a lower-dimensional current in a robust manner. Using this to model KaiC, which regulates the circadian rhythm in cyanobacteria, we compare the coherence of oscillations to that in other KaiC models. Our topological model localizes currents on the system edge, with an efficient regime of simultaneously increased precision and decreased cost. Further, we introduce a new predictor of coherence from the analysis of spectral gaps, and show that our model saturates a global thermodynamic bound. Our work presents a new mechanism and parsimonious description for robust emergent oscillations in complex biological networks.
Phase change memory devices become practical for non-volatile storage at small dimensions due to reduced power and predictable device operation. In larger scale cells, devices can be locally melted due to filament formation and liquid filaments can be retained in parts of the cell for a long time even if most or all of the cells are initially amorphized during long fall-times. The complex amorphization and crystallization dynamics make these large cells more unpredictable and enable their applications as physically unclonable functions (PUF) [1,2]. Computational analysis of the complex amorphization-crystallization dynamics in phase change memory devices with large geometries is important to understand the evolution of phase distributions and temperature profiles during programming of these devices. In this work, we conduct electrothermal finite element simulations of reset operation on a large Ge2Sb2Te5 (GST) cell using the framework we have developed in COMSOL Multiphysics [3]-[9] and analyze the complex dynamics of amorphization, nucleation and growth during electrical stress. We input voltage waveforms measured from electrical characterization of on-oxide GST line cells with bottom metal contact pads and Si3N4 capping. A 2D polycrystalline model of the experimentally measured cells (~360 nm wide, ~400 nm long and ~50 nm thick) is constructed in the simulations. Access devices are modeled using SPICE models. The simulations capture some of the interplay between changes in the device resistance due to heating and phase changes and current fluctuations.