Abstract The prediction of (un)binding rates and free energies is of great significance to the drug design process. Although many enhanced sampling algorithms and approaches have been developed, there is not yet a reliable workflow to predict these quantities. Previously we have shown that free energies and transition rates can be calculated by directly simulating the binding and unbinding processes with our variant of the weighted ensemble (WE) algorithm, “Resampling of Ensembles by Variation Optimization”, or “REVO”. Here, we calculate binding free energies retrospectively for three SAMPL6 host‐guest systems and prospectively for a SAMPL9 system to test a modification of REVO that restricts its cloning behavior in quasi‐unbound states. Specifically, trajectories cannot clone if they meet a physical requirement that represents a high likelihood of unbinding, which in this work is a center‐of‐mass to center‐of‐mass distance. The overall effect of this change was difficult to predict, because it yields fewer unbinding events, each carrying a much higher statistical weight. For all four systems tested, this new strategy produced either more accurate unbinding free energies or more consistent results between simulations than the standard REVO algorithm. This approach is highly flexible, and any feature of interest for a system can be used to determine cloning eligibility. These findings thus constitute an important improvement in the calculation of transition rates and binding free energies with the weighted ensemble method.
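The cloning restriction described in this abstract can be illustrated with a minimal sketch. It assumes that a walker counts as quasi‐unbound once the host and guest centers of mass are farther apart than some cutoff; the function names, the `cutoff` parameter, and the direction of the comparison are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def center_of_mass(positions, masses):
    """Mass-weighted mean position of a group of atoms (N x 3 coordinates)."""
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (positions * masses[:, None]).sum(axis=0) / masses.sum()

def can_clone(host_pos, host_mass, guest_pos, guest_mass, cutoff):
    """Hypothetical eligibility test: a walker may be cloned only while the
    host-guest center-of-mass distance stays below `cutoff` (assumed rule)."""
    separation = np.linalg.norm(
        center_of_mass(host_pos, host_mass) - center_of_mass(guest_pos, guest_mass)
    )
    return bool(separation < cutoff)
```

Because the abstract notes that any feature of interest can determine cloning eligibility, the distance test in `can_clone` could be swapped for another per-walker criterion without changing the rest of a resampling loop.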
REVO: Resampling of ensembles by variation optimization
Conventional molecular dynamics simulations are incapable of sampling many important interactions in biomolecular systems due to their high dimensionality and rough energy landscapes. To observe rare events and calculate transition rates in these systems, enhanced sampling is a necessity. In particular, the study of ligand-protein interactions necessitates a diverse ensemble of protein conformations and transition states, and for many systems, this occurs on prohibitively long time scales. Previous strategies such as WExplore that can be used to determine these types of ensembles are hindered by problems related to the regioning of conformational space. Here, we propose a novel, regionless, enhanced sampling method that is based on the weighted ensemble framework. In this method, a value referred to as “trajectory variation” is optimized after each cycle through cloning and merging operations. This method allows for a more consistent measurement of observables and broader sampling resulting in the efficient exploration of previously unexplored conformations. We demonstrate the performance of this algorithm with the N-dimensional random walk and the unbinding of the trypsin-benzamidine system. The system is analyzed using conformation space networks, the residence time of benzamidine is confirmed, and a new unbinding pathway for the trypsin-benzamidine system is found. We expect that resampling of ensembles by variation optimization will be a useful general tool to broadly explore free energy landscapes.
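The cloning and merging operations mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions: the `variation` function here is a plain sum of pairwise Euclidean distances used as a stand-in, not the trajectory variation defined in the paper, and the function names are hypothetical. The weight bookkeeping (a clone splits the weight in half; a merge sums the weights and keeps one survivor chosen in proportion to weight) follows the standard weighted ensemble convention.

```python
import numpy as np

def variation(coords):
    """Stand-in objective: sum of all pairwise distances between walkers."""
    diffs = coords[:, None, :] - coords[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).sum() / 2.0)

def clone(coords, weights, i):
    """Duplicate walker i; each copy carries half of the original weight."""
    half = weights[i] / 2.0
    new_weights = np.append(weights, half)  # new array, original untouched
    new_weights[i] = half
    return np.vstack([coords, coords[i]]), new_weights

def merge(coords, weights, i, j, rng):
    """Merge walkers i and j; the survivor is chosen with probability
    proportional to its weight and receives the combined weight."""
    total = weights[i] + weights[j]
    keep, drop = (i, j) if rng.random() < weights[i] / total else (j, i)
    new_weights = weights.copy()
    new_weights[keep] = total
    return np.delete(coords, drop, axis=0), np.delete(new_weights, drop)

# Toy ensemble: three walkers clustered near the origin and one outlier.
rng = np.random.default_rng(0)
coords = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
weights = np.full(4, 0.25)
cloned_coords, cloned_weights = clone(coords, weights, 3)   # clone the outlier
new_coords, new_weights = merge(cloned_coords, cloned_weights, 0, 1, rng)
```

In this toy case, cloning the outlier and merging two nearby walkers keeps the walker count and total weight fixed while raising the stand-in variation, which is the kind of move a variation-optimizing resampler would favor.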
- Award ID(s): 1761320
- PAR ID: 10593486
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: The Journal of Chemical Physics
- Volume: 150
- Issue: 24
- ISSN: 0021-9606
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Proteins are inherently dynamic, and their conformational ensembles are functionally important in biology. Large-scale motions may govern protein structure–function relationships, and numerous transient but stable conformations of intrinsically disordered proteins (IDPs) can play a crucial role in biological function. Investigating conformational ensembles to understand the regulation and disease-related aggregation of IDPs is challenging both experimentally and computationally. In this paper we first introduce an unsupervised deep learning-based model, termed Internal Coordinate Net (ICoN), which learns the physical principles of conformational changes from molecular dynamics (MD) simulation data. Second, we select interpolating data points in the learned latent space that rapidly identify novel synthetic conformations with sophisticated, large-scale sidechain and backbone arrangements. Third, for the highly dynamic amyloid-β1-42 (Aβ42) monomer, our deep learning model provides a comprehensive sampling of Aβ42’s conformational landscape. Analysis of these synthetic conformations revealed conformational clusters that can be used to rationalize experimental findings. Additionally, the method can identify novel conformations with important interactions in atomistic detail that are not included in the training data. New synthetic conformations showed distinct sidechain rearrangements that are probed by our EPR and amino acid substitution studies. This approach is highly transferable and can be trained on any available data. The work also demonstrates the ability of deep learning to utilize learned natural atomistic motions in protein conformation sampling.
-
Molecular dynamics (MD) simulations are fundamental computational tools for the study of proteins and their free energy landscapes. However, sampling protein conformational changes through MD simulations is challenging due to the relatively long time scales of these processes. Many enhanced sampling approaches have emerged to tackle this problem, including biased sampling and path-sampling methods. In this Perspective, we focus on adaptive sampling algorithms. These techniques differ from other approaches because the thermodynamic ensemble is preserved and the sampling is enhanced solely by restarting MD trajectories at particularly chosen seeds rather than introducing biasing forces. We begin our treatment with an overview of theoretically transparent methods, where we discuss principles and guidelines for adaptive sampling. Then, we present a brief summary of select methods that have been applied to realistic systems in the past. Finally, we discuss recent advances in adaptive sampling methodology powered by deep learning techniques, as well as their shortcomings.
-
The Class I Major Histocompatibility Complex (MHC) is a central protein in immunology as it binds to intracellular peptides and displays them at the cell surface for recognition by T-cells. The structural analysis of bound peptide-MHC complexes (pMHCs) holds the promise of interpretable and general binding prediction (i.e., testing whether a given peptide binds to a given MHC). However, structural analysis is limited in part by the difficulty in modelling pMHCs given the size and flexibility of the peptides that can be presented by MHCs. This article describes APE-Gen (Anchored Peptide-MHC Ensemble Generator), a fast method for generating ensembles of bound pMHC conformations. APE-Gen generates an ensemble of bound conformations by iterated rounds of (i) anchoring the ends of a given peptide near known pockets in the binding site of the MHC, (ii) sampling peptide backbone conformations with loop modelling, and then (iii) performing energy minimization to fix steric clashes, accumulating conformations at each round. APE-Gen takes only minutes on a standard desktop to generate tens of bound conformations, and we show the ability of APE-Gen to sample conformations found in X-ray crystallography even when only sequence information is used as input. APE-Gen has the potential to be useful for its scalability (i.e., modelling thousands of pMHCs or even non-canonical longer peptides) and for its use as a flexible search tool. We demonstrate an example for studying cross-reactivity.
-
Masel, J (Ed.) Abstract Epistasis, in which mutations combine nonadditively, is a profoundly important aspect of biology. It is often difficult to understand its mechanistic origins. Here, we show that epistasis can arise from the thermodynamic ensemble, or the set of interchanging conformations a protein adopts. Ensemble epistasis occurs because mutations can have different effects on different conformations of the same protein, leading to nonadditive effects on its average, observable properties. Using a simple analytical model, we found that ensemble epistasis arises when two conditions are met: (1) a protein populates at least three conformations and (2) mutations have differential effects on at least two conformations. To explore the relative magnitude of ensemble epistasis, we performed a virtual deep-mutational scan of the allosteric Ca2+ signaling protein S100A4. We found that 47% of mutation pairs exhibited ensemble epistasis with a magnitude on the order of thermal fluctuations. We observed many forms of epistasis: magnitude, sign, and reciprocal sign epistasis. The same mutation pair could even exhibit different forms of epistasis under different environmental conditions. The ubiquity of thermodynamic ensembles in biology and the pervasiveness of ensemble epistasis in our dataset suggest that it may be a common mechanism of epistasis in proteins and other macromolecules.
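The two conditions stated in this abstract can be checked numerically with a small sketch. The setup is an assumption made for illustration, not the authors' model: the observable is taken to be the free energy of all conformations other than a reference conformation (index 0), measured relative to that reference, and each mutation shifts every conformation's free energy additively. The function names and the numerical value of `RT` are likewise illustrative.

```python
import numpy as np

RT = 0.593  # kcal/mol near 298 K (assumed temperature)

def observable(G):
    """Free energy of conformations 1..n, Boltzmann-combined, relative to
    reference conformation 0 (an assumed choice of observable)."""
    G = np.asarray(G, dtype=float)
    return float(-RT * np.log(np.exp(-G[1:] / RT).sum()) - G[0])

def epistasis(G_wt, ddG_a, ddG_b):
    """Deviation of the double mutant from the sum of the single-mutant
    effects, with mutations acting additively on each conformation."""
    G_wt, ddG_a, ddG_b = (np.asarray(x, dtype=float) for x in (G_wt, ddG_a, ddG_b))
    return (observable(G_wt + ddG_a + ddG_b) - observable(G_wt + ddG_a)
            - observable(G_wt + ddG_b) + observable(G_wt))

# Two conformations: the observable is linear in G, so epistasis vanishes.
two_state = epistasis([0.0, 1.0], [0.5, 2.0], [1.0, -1.0])
# Three conformations, with mutations that affect conformations 1 and 2 differently.
three_state = epistasis([0.0, 1.0, 1.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0])
```

Under these assumptions `two_state` is zero to floating-point precision while `three_state` is not, because the logarithm of a sum of Boltzmann factors is nonlinear only when more than one conformation contributes to the sum and the mutations shift those conformations by different amounts.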
