Title: Molecular Latent Space Simulators
Small integration time steps limit molecular dynamics (MD) simulations to millisecond time scales. Markov state models (MSMs) and equation-free approaches learn low-dimensional kinetic models from MD simulation data by performing configurational or dynamical coarse-graining of the state space. The learned kinetic models enable the efficient generation of dynamical trajectories over vastly longer time scales than are accessible by MD, but the discretization of configurational space and/or absence of a means to reconstruct molecular configurations precludes the generation of continuous all-atom molecular trajectories. We propose latent space simulators (LSS) to learn kinetic models for continuous all-atom simulation trajectories by training three deep learning networks to (i) learn the slow collective variables of the molecular system, (ii) propagate the system dynamics within this slow latent space, and (iii) generatively reconstruct molecular configurations. We demonstrate the approach in an application to Trp-cage miniprotein to produce novel ultra-long synthetic folding trajectories that accurately reproduce all-atom molecular structure, thermodynamics, and kinetics at six orders of magnitude lower cost than MD. The dramatically lower cost of trajectory generation enables greatly improved sampling and greatly reduced statistical uncertainties in estimated thermodynamic averages and kinetic rates.
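To illustrate the Markov state model idea mentioned in the abstract, the following is a minimal sketch, not the authors' method. It assumes the MD trajectory has already been discretized into integer state labels, estimates a row-normalized transition matrix by counting transitions at a fixed lag, and then samples a synthetic discrete trajectory from it. The function names `estimate_transition_matrix` and `sample_trajectory` and the toy trajectory are hypothetical. The sketch also shows the limitation the abstract points out: the output is a sequence of state labels, not continuous all-atom configurations.

```python
import random


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j separated by `lag` steps and row-normalize."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States with no observed outgoing transitions get an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def sample_trajectory(matrix, start, n_steps, seed=0):
    """Generate a synthetic state sequence from the transition matrix.

    Assumes every state that is visited has at least one observed
    outgoing transition (otherwise its row has no valid weights).
    """
    rng = random.Random(seed)
    states = list(range(len(matrix)))
    traj = [start]
    for _ in range(n_steps):
        traj.append(rng.choices(states, weights=matrix[traj[-1]])[0])
    return traj


# Hypothetical two-state discretized trajectory (e.g. 0 = unfolded, 1 = folded).
dtraj = [0, 0, 1, 0, 1, 1, 0, 0, 1, 1]
T = estimate_transition_matrix(dtraj, n_states=2)
synthetic = sample_trajectory(T, start=0, n_steps=1000)
```

Sampling from the matrix costs one weighted draw per step, which is why such kinetic models can reach far longer time scales than direct MD integration.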
Award ID(s):
1841805
PAR ID:
10186845
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Chemical Science
ISSN:
2041-6520
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper introduces a computational framework to reconstruct and forecast a partially observed state that evolves according to an unknown or expensive-to-simulate dynamical system. Our reduced-order autodifferentiable ensemble Kalman filters (ROAD-EnKFs) learn a latent low-dimensional surrogate model for the dynamics and a decoder that maps from the latent space to the state space. The learned dynamics and decoder are then used within an EnKF to reconstruct and forecast the state. Numerical experiments show that if the state dynamics exhibit a hidden low-dimensional structure, ROAD-EnKFs achieve higher accuracy at lower computational cost compared to existing methods. If such structure is not expressed in the latent state dynamics, ROAD-EnKFs achieve similar accuracy at lower cost, making them a promising approach for surrogate state reconstruction and forecasting.
  2. In this paper, we first introduce a variational formulation of the Unit Commitment (UC) problem, in which the generation and ramping trajectories of the generating units are continuous-time signals and the generating units' cost depends on three signals: the binary commitment status of the units as well as their continuous-time generation and ramping trajectories. We assume such bids are piecewise strictly convex time-varying linear functions of these three variables. Based on this problem, we derive a tractable approximation by constraining the commitment trajectories to switch at a discrete and finite set of points and representing the trajectories in the function space of piecewise polynomial functions within the intervals, whose discrete coefficients are then the UC problem decision variables. Our judicious choice of the signal space allows us to represent cost and constraints as linear functions of these coefficients; thus, our UC model preserves the MILP formulation of the UC problem. Numerical simulations over real load data from the California ISO demonstrate that the proposed UC model reduces the total day-ahead and real-time operation cost and the number of ramping scarcity events in real-time operations.
  3. We perform a geomagnetic event simulation using a newly developed magnetohydrodynamic with adaptively embedded particle‐in‐cell (MHD‐AEPIC) model. We have developed effective criteria to identify reconnection sites in the magnetotail and cover them with the PIC model. The MHD‐AEPIC simulation results are compared with Hall MHD and ideal MHD simulations to study the impacts of kinetic reconnection at multiple physical scales. At the global scale, the three models produce very similar SYM‐H and SuperMag Electrojet indexes, which indicates that the global magnetic field configurations from the three models are very close to each other. We also compare the ionospheric solver results and all three models generate similar polar cap potentials and field‐aligned currents. At the mesoscale, we compare the simulations with in situ Geotail observations in the tail. All three models produce reasonable agreement with the Geotail observations. At the kinetic scales, the MHD‐AEPIC simulation can produce a crescent shape distribution of the electron velocity space at the electron diffusion region, which agrees very well with MMS observations near a tail reconnection site. These electron scale kinetic features are not available in either the Hall MHD or ideal MHD models. Overall, the MHD‐AEPIC model compares well with observations at all scales, it works robustly, and the computational cost is acceptable due to the adaptive adjustment of the PIC domain. It remains to be determined whether kinetic physics can play a more significant role in other types of events, including but not limited to substorms.
  4. Computational methodologies are increasingly addressing modeling of the whole cell at the molecular level. Proteins and their interactions are the key component of cellular processes. Techniques for modeling protein interactions, thus far, have included protein docking and molecular simulation. The latter approaches account for the dynamics of the interactions but are relatively slow, if carried out at all-atom resolution, or are significantly coarse grained. Protein docking algorithms are far more efficient in sampling spatial coordinates. However, they do not account for the kinetics of the association (i.e., they do not involve the time coordinate). Our proof-of-concept study bridges the two modeling approaches, developing an approach that can reach unprecedented simulation timescales at all-atom resolution. The global intermolecular energy landscape of a large system of proteins was mapped by the pairwise fast Fourier transform docking and sampled in space and time by Monte Carlo simulations. The simulation protocol was parametrized on existing data and validated on a number of observations from experiments and molecular dynamics simulations. The simulation protocol performed consistently across very different systems of proteins at different protein concentrations. It recapitulated data on the previously observed protein diffusion rates and aggregation. The speed of calculation allows reaching second-long trajectories of protein systems that approach the size of the cells, at atomic resolution. 
  5. Single-molecule Förster resonance energy transfer (smFRET) is an experimental methodology to track the real-time dynamics of molecules using fluorescent probes to follow one or more intramolecular distances. These distances provide a low-dimensional representation of the full atomistic dynamics. Under mild technical conditions, Takens’ Delay Embedding Theorem guarantees that the full three-dimensional atomistic dynamics of a system are diffeomorphic (i.e., related by a smooth and invertible transformation) to a time-delayed embedding of one or more scalar observables. Appealing to these theoretical guarantees, we employ manifold learning, artificial neural networks, and statistical mechanics to learn from molecular simulation training data the a priori unknown transformation between the atomic coordinates and delay-embedded intramolecular distances accessible to smFRET. This learned transformation may then be used to reconstruct atomistic coordinates from smFRET time series data. We term this approach Single-molecule TAkens Reconstruction (STAR). We have previously applied STAR to reconstruct molecular configurations of a C24H50 polymer chain and the mini-protein Chignolin with accuracies better than 0.2 nm from simulated smFRET data under noise free and high time resolution conditions. In the present work, we investigate the role of signal-to-noise ratio, data volume, and time resolution in simulated smFRET data to assess the performance of STAR under conditions more representative of experimental realities. We show that STAR can reconstruct the Chignolin and Villin mini-proteins to accuracies of 0.12 and 0.42 nm, respectively, and place bounds on these conditions for accurate reconstructions. These results demonstrate that it is possible to reconstruct dynamical trajectories of protein folding from time series in noisy, time binned, experimentally measurable observables and lay the foundations for the application of STAR to real experimental data. 
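The time-delayed embedding that item 5 relies on can be sketched as follows. This is a minimal illustration of the embedding step only, not the STAR pipeline: it turns one scalar time series (for example, a single intramolecular distance) into vectors of lagged values. The function name `delay_embed`, the parameters `dim` and `tau`, and the sample distances are hypothetical and chosen only for illustration.

```python
def delay_embed(series, dim, tau=1):
    """Build delay vectors [x(t), x(t + tau), ..., x(t + (dim - 1) * tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[t + k * tau] for k in range(dim)] for t in range(n_vectors)]


# Hypothetical distance time series in nm (illustrative values only).
distances = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0]
embedded = delay_embed(distances, dim=3, tau=2)
# embedded == [[1.0, 0.9, 1.3], [1.2, 1.1, 1.0]]
```

Each row of `embedded` is one point in the delay-embedded space; mapping those points back to atomic coordinates is the learned transformation the abstract describes and is not shown here.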