Title: Patterns in Protein Flexibility: A Comparison of NMR “Ensembles”, MD Trajectories, and Crystallographic B-Factors
Proteins are molecular machines requiring flexibility to function. Crystallographic B-factors and Molecular Dynamics (MD) simulations both provide insights into protein flexibility on an atomic scale. Nuclear Magnetic Resonance (NMR) lacks a universally accepted analog of the B-factor. However, a lack of convergence in atomic coordinates in an NMR-based structure calculation also suggests atomic mobility. This paper describes a pattern in the coordinate uncertainties of backbone heavy atoms in NMR-derived structural “ensembles” first noted in the development of FindCore2 (previously called Expanded FindCore: DA Snyder, J Grullon, YJ Huang, R Tejero, GT Montelione, Proteins: Structure, Function, and Bioinformatics 82 (S2), 219–230) and demonstrates that this pattern exists in coordinate variances across MD trajectories but not in crystallographic B-factors. This either suggests that MD trajectories and NMR “ensembles” capture motional behavior of peptide bond units not captured by B-factors or indicates a deficiency common to force fields used in both NMR and MD calculations.
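As a rough illustration of how coordinate variances from an NMR “ensemble” or an MD trajectory can be put on the same scale as crystallographic B-factors, the sketch below computes a per-atom mean squared displacement and converts it with the standard isotropic relation B = (8π²/3)⟨Δr²⟩. This is not the FindCore2 procedure or the paper's own analysis; the function names and the toy coordinates are hypothetical, and the models are assumed to be already superimposed on a common frame.

```python
import math


def per_atom_variance(ensemble):
    """Mean squared displacement of each atom from its average position.

    `ensemble` is a list of models; each model is a list of (x, y, z)
    tuples in angstroms. Models are assumed to be pre-superimposed.
    Returns one value per atom, in square angstroms.
    """
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    variances = []
    for i in range(n_atoms):
        # Average position of atom i over all models.
        mean = [sum(model[i][k] for model in ensemble) / n_models
                for k in range(3)]
        # Average squared distance of atom i from that mean position.
        msd = sum(
            sum((model[i][k] - mean[k]) ** 2 for k in range(3))
            for model in ensemble
        ) / n_models
        variances.append(msd)
    return variances


def equivalent_b_factor(msd):
    """Isotropic B-factor equivalent of a 3D mean squared displacement."""
    return 8.0 * math.pi ** 2 / 3.0 * msd


# Toy two-model "ensemble" with two atoms (made-up coordinates).
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
variances = per_atom_variance(toy_ensemble)
b_equiv = [equivalent_b_factor(v) for v in variances]
```

In this toy case the first atom does not move and the second has a mean squared displacement of 1 Å². A per-atom profile computed this way for backbone heavy atoms could be compared against the B-factor column of a crystal structure of the same protein, which is the kind of comparison the abstract describes.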
Award ID(s):
1909824
PAR ID:
10316273
Journal Name:
Molecules
Volume:
26
Issue:
5
ISSN:
1420-3049
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. de Groot, Bert L. (Ed.)
    Intrinsically disordered proteins (IDPs) are highly dynamic systems that play an important role in cell signaling processes and their misfunction often causes human disease. Proper understanding of IDP function not only requires the realistic characterization of their three-dimensional conformational ensembles at atomic-level resolution but also of the time scales of interconversion between their conformational substates. Large sets of experimental data are often used in combination with molecular modeling to restrain or bias models to improve agreement with experiment. It is shown here for the N-terminal transactivation domain of p53 (p53TAD) and Pup, which are two IDPs that fold upon binding to their targets, how the latest advancements in molecular dynamics (MD) simulation methodology produce native conformational ensembles by combining replica exchange with a series of microsecond MD simulations. They closely reproduce experimental data at the global conformational ensemble level, in terms of the distribution properties of the radius of gyration tensor, and at the local level, in terms of NMR properties including ¹⁵N spin relaxation, without the need for reweighting. Further inspection revealed that 10–20% of the individual MD trajectories display the formation of secondary structures not observed in the experimental NMR data. The IDP ensembles were analyzed by graph theory to identify dominant inter-residue contact clusters and characteristic amino-acid contact propensities. These findings indicate that modern MD force fields with residue-specific backbone potentials can produce highly realistic IDP ensembles sampling a hierarchy of nano- and picosecond time scales providing new insights into their biological function.
  2. Proteins are inherently dynamic, and their conformational ensembles are functionally important in biology. Large-scale motions may govern protein structure–function relationship, and numerous transient but stable conformations of intrinsically disordered proteins (IDPs) can play a crucial role in biological function. Investigating conformational ensembles to understand regulations and disease-related aggregations of IDPs is challenging both experimentally and computationally. In this paper we first introduced an unsupervised deep learning-based model, termed Internal Coordinate Net (ICoN), which learns the physical principles of conformational changes from molecular dynamics (MD) simulation data. Second, we selected interpolating data points in the learned latent space that rapidly identify novel synthetic conformations with sophisticated and large-scale sidechains and backbone arrangements. Third, with the highly dynamic amyloid-β1-42 (Aβ42) monomer, our deep learning model provided a comprehensive sampling of Aβ42’s conformational landscape. Analysis of these synthetic conformations revealed conformational clusters that can be used to rationalize experimental findings. Additionally, the method can identify novel conformations with important interactions in atomistic details that are not included in the training data. New synthetic conformations showed distinct sidechain rearrangements that are probed by our EPR and amino acid substitution studies. This approach is highly transferable and can be used for any available data for training. The work also demonstrated the ability for deep learning to utilize learned natural atomistic motions in protein conformation sampling.
  3. Proteins are the active players in performing essential molecular activities throughout biology, and their dynamics has been broadly demonstrated to relate to their mechanisms. The intrinsic fluctuations have often been used to represent their dynamics and then compared to the experimental B‐factors. However, proteins do not move in a vacuum and their motions are modulated by solvent that can impose forces on the structure. In this paper, we introduce a new structural concept, which has been called the structural compliance, for the evaluation of the global and local deformability of the protein structure in response to intramolecular and solvent forces. Based on the application of pairwise pulling forces to a protein elastic network, this structural quantity has been computed and sometimes is even found to yield an improved correlation with the experimental B‐factors, meaning that it may serve as a better metric for protein flexibility. The inverse of structural compliance, namely the structural stiffness, has also been defined, which shows a clear anticorrelation with the experimental data. Although the present applications are made to proteins, this approach can also be applied to other biomolecular structures such as RNA. This present study considers only elastic network models, but the approach could be applied further to conventional atomic molecular dynamics. Compliance is found to have a slightly better agreement with the experimental B‐factors, perhaps reflecting its bias toward the effects of local perturbations, in contrast to mean square fluctuations. The code for calculating protein compliance and stiffness is freely accessible at https://jerniganlab.github.io/Software/PACKMAN/Tutorials/compliance.
  4. DNA exhibits local conformational preferences that affect its ability to adopt biologically relevant conformations, such as those required for binding proteins. Traditional methods, like Markov state models and molecular dynamics (MD) simulations, have advanced our understanding but often struggle to capture these rare conformational states due to high computational demands. Here, we introduce a novel AI framework based on dynamical graphical models (DGMs), a generative machine learning approach trained on equilibrium MD data, to predict DNA conformational transitions that are never seen in the MD ensembles. By leveraging local DNA interactions, DGMs generate a comprehensive transition matrix that captures both thermodynamic and kinetic properties of unsampled states, enabling accurate predictions of rare global conformations without the need for extensive sampling. Applying this model to the B→A transition, we demonstrate that DGMs can efficiently predict sequence-dependent A-DNA preferences, achieving results that align closely with replica exchange umbrella sampling simulations. DGMs provide new insights into DNA sequence–structure relationships, paving the way for applications in DNA sequence design and optimization.
  5. Molecular dynamics (MD) simulations are immensely valuable for studying protein structure, function and dynamics. Their ability to capture atomic‐level behavior of molecules and describe their evolution over time makes them a powerful synergistic tool for biochemistry, structural biology and other life sciences. To advance research and knowledge on reasonable timescales, researchers must optimize the amount of useful information extracted from simulation data while often frugally managing computational resources. Often, this involves balancing the length of MD trajectories with the number of replicas of a given system, with the aim of maximizing sampling of the conformational landscape. However, identifying this balance is not always intuitive, and the lack of standards among researchers can produce large variability in results and predictions from MD measurements. Here, we investigate the variability in MD results when simulation length and replica numbers are varied. Using a 231‐amino acid domain, we compare measurements from independent trajectories to a benchmark trajectory of three 1000‐ns replicates. We perform these simulations on 27 protein‐ligand complexes, allowing us to compare ligand‐specific rankings of complexes across independent replicas. Our results reveal that some MD measurements are accurately ranked by single trajectories, while others are not. We uncover similar variability in the effects of trajectory lengths on measurements. Our findings suggest that a one‐size‐fits‐all approach to MD simulations is not necessarily the best approach, and depending on the intended measurements and research question, it may sometimes be advantageous to prioritize longer trajectories over multiple replicas. This work provides important considerations for researchers while designing simulation studies.