Molecular dynamics (MD) simulations are fundamental computational tools for the study of proteins and their free energy landscapes. However, sampling protein conformational changes through MD simulations is challenging due to the relatively long time scales of these processes. Many enhanced sampling approaches have emerged to tackle this problem, including biased sampling and path-sampling methods. In this Perspective, we focus on adaptive sampling algorithms. These techniques differ from other approaches because the thermodynamic ensemble is preserved and the sampling is enhanced solely by restarting MD trajectories at particularly chosen seeds rather than introducing biasing forces. We begin our treatment with an overview of theoretically transparent methods, where we discuss principles and guidelines for adaptive sampling. Then, we present a brief summary of select methods that have been applied to realistic systems in the past. Finally, we discuss recent advances in adaptive sampling methodology powered by deep learning techniques, as well as their shortcomings.
Active Learning of the Conformational Ensemble of Proteins Using Maximum Entropy VAMPNets
Rapid computational exploration of the free energy landscape of biological molecules remains an active area of research due to the difficulty of sampling rare state transitions in molecular dynamics (MD) simulations. In recent years, an increasing number of studies have exploited machine learning (ML) models to enhance and analyze MD simulations. Notably, unsupervised models that extract kinetic information from a set of parallel trajectories have been proposed, including the variational approach for Markov processes (VAMP), VAMPNets, and time-lagged variational autoencoders (TVAE). In this work, we propose a combination of adaptive sampling with active learning of kinetic models to accelerate the discovery of the conformational landscape of biomolecules. In particular, we introduce and compare several techniques that combine kinetic models with two adaptive sampling regimes (least counts and multiagent reinforcement learning-based adaptive sampling) to enhance the exploration of conformational ensembles without introducing biasing forces. Moreover, inspired by the active learning approach of uncertainty-based sampling, we also present MaxEnt VAMPNet. This technique consists of restarting simulations from the microstates that maximize the Shannon entropy of a VAMPNet trained to perform the soft discretization of metastable states. By running simulations on two test systems, the WLALL pentapeptide and the villin headpiece subdomain, we empirically demonstrate that MaxEnt VAMPNet results in faster exploration of conformational landscapes compared with the baseline and other proposed methods.
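The seed-selection rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`shannon_entropy`, `maxent_seeds`, `least_counts_seeds`) are hypothetical, and the soft state assignments are a hand-written toy array standing in for the output of a trained VAMPNet. The sketch assumes that each simulation frame has a probability vector over metastable states, and that new trajectories are restarted from the frames whose assignment is most uncertain. A simple least-counts baseline over hard state labels is included for comparison.

```python
import numpy as np

def shannon_entropy(probs, eps=1e-12):
    """Row-wise Shannon entropy of soft state assignments (frames x states)."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0) for hard assignments
    return -np.sum(p * np.log(p), axis=1)

def maxent_seeds(probs, n_seeds):
    """Indices of the n_seeds frames with the highest assignment entropy."""
    entropy = shannon_entropy(probs)
    # Sort ascending, keep the last n_seeds, and reverse to descending order.
    return np.argsort(entropy, kind="stable")[-n_seeds:][::-1]

def least_counts_seeds(labels, n_seeds):
    """Baseline: first frame of each of the n_seeds least-populated states."""
    states, counts = np.unique(labels, return_counts=True)
    rare = states[np.argsort(counts, kind="stable")[:n_seeds]]
    return np.array([np.flatnonzero(labels == s)[0] for s in rare])

# Toy soft assignments for four frames over three states
# (placeholder for the output of a trained VAMPNet).
probs = np.array([
    [0.98, 0.01, 0.01],  # confidently in state 0
    [0.34, 0.33, 0.33],  # maximally ambiguous
    [0.50, 0.49, 0.01],  # on the boundary between states 0 and 1
    [0.01, 0.98, 0.01],  # confidently in state 1
])
seeds = maxent_seeds(probs, n_seeds=2)  # selects frames 1 and 2

# Toy hard labels for six frames (e.g. from clustering).
labels = np.array([0, 0, 0, 1, 1, 2])
baseline = least_counts_seeds(labels, n_seeds=2)  # selects frames 5 and 3
```

In this toy case the maximum-entropy rule favors frames near boundaries between states, whereas least counts favors frames from rarely visited states; neither rule adds a biasing force, since both only choose where unbiased trajectories restart.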
- Award ID(s):
- 1845606
- PAR ID:
- 10407582
- Date Published:
- Journal Name:
- Journal of Chemical Theory and Computation
- ISSN:
- 1549-9618
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Schmidt-Krey, Ingeborg; Gumbart, James C. (Ed.) Molecular dynamics (MD) simulations are routinely used to study structural dynamics of membrane proteins. However, conventional MD is often unable to sample functionally important conformational transitions of membrane proteins such as those involved in active membrane transport or channel activation process. Here we describe a combination of multiple MD based techniques that allows for a rigorous characterization of energetics and kinetics of large-scale conformational changes in membrane proteins. The methodology is based on biased, nonequilibrium, collective-variable based simulations including nonequilibrium pulling, string method with swarms of trajectories, bias-exchange umbrella sampling, and rate estimation techniques.
-
Small integration time steps limit molecular dynamics (MD) simulations to millisecond time scales. Markov state models (MSMs) and equation-free approaches learn low-dimensional kinetic models from MD simulation data by performing configurational or dynamical coarse-graining of the state space. The learned kinetic models enable the efficient generation of dynamical trajectories over vastly longer time scales than are accessible by MD, but the discretization of configurational space and/or absence of a means to reconstruct molecular configurations precludes the generation of continuous all-atom molecular trajectories. We propose latent space simulators (LSS) to learn kinetic models for continuous all-atom simulation trajectories by training three deep learning networks to (i) learn the slow collective variables of the molecular system, (ii) propagate the system dynamics within this slow latent space, and (iii) generatively reconstruct molecular configurations. We demonstrate the approach in an application to Trp-cage miniprotein to produce novel ultra-long synthetic folding trajectories that accurately reproduce all-atom molecular structure, thermodynamics, and kinetics at six orders of magnitude lower cost than MD. The dramatically lower cost of trajectory generation enables greatly improved sampling and greatly reduced statistical uncertainties in estimated thermodynamic averages and kinetic rates.
-
Molecular dynamics (MD) is the method of choice for understanding the structure, function, and interactions of molecules. However, MD simulations are limited by the strong metastability of many molecules, which traps them in a single conformation basin for an extended amount of time. Enhanced sampling techniques, such as metadynamics and replica exchange, have been developed to overcome this limitation and accelerate the exploration of complex free energy landscapes. In this paper, we propose Vendi Sampling, a replica-based algorithm for increasing the efficiency and efficacy of the exploration of molecular conformation spaces. In Vendi sampling, replicas are simulated in parallel and coupled via a global statistical measure, the Vendi Score, to enhance diversity. Vendi sampling allows for the recovery of unbiased sampling statistics and dramatically improves sampling efficiency. We demonstrate the effectiveness of Vendi sampling in improving molecular dynamics simulations by showing significant improvements in coverage and mixing between metastable states and convergence of free energy estimates for four common benchmarks, including Alanine Dipeptide and Chignolin.
-
Molecular dynamics (MD) simulations provide a powerful means of exploring the dynamic behavior of biomolecular systems at the atomic level. However, analyzing the vast data sets generated by MD simulations poses significant challenges. This article discusses the energy landscape visualization method (ELViM), a multidimensional reduction technique inspired by the energy landscape theory. ELViM transcends one-dimensional representations, offering a comprehensive analysis of the effective conformational phase space without the need for predefined reaction coordinates. We apply the ELViM to study the folding landscape of the antimicrobial peptide Polybia-MP1, showcasing its versatility in capturing complex biomolecular dynamics. Using dissimilarity matrices and a force-scheme approach, the ELViM provides intuitive visualizations, revealing structural correlations and local conformational signatures. The method is demonstrated to be adaptable, robust, and applicable to various biomolecular systems.

