Title: Machine learning coarse-grained potentials of protein thermodynamics
Abstract: A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.
Award ID(s):
2019745
PAR ID:
10512354
Publisher / Repository:
Nature Communications
Date Published:
Journal Name:
Nature Communications
Volume:
14
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We developed coarse-grained models of spike proteins in SARS-CoV-2 coronavirus and angiotensin-converting enzyme 2 (ACE2) receptor proteins to study the endocytosis of a whole coronavirus under physiologically relevant spatial and temporal scales. We first conducted all-atom explicit-solvent molecular dynamics simulations of the recently characterized structures of spike and ACE2 proteins. We then established coarse-grained models using the shape-based coarse-graining approach based on the protein crystal structures and extracted the force field parameters from the all-atom simulation trajectories. To further analyze the coarse-grained models, we carried out normal mode analysis of the coarse-grained models to refine the force field parameters by matching the fluctuations of the internal coordinates with the original all-atom simulations. Finally, we demonstrated the capability of these coarse-grained models by simulating the endocytosis of a whole coronavirus through the host cell membrane. We embedded the coarse-grained models of spikes on the surface of the virus envelope and anchored ACE2 receptors on the host cell membrane, which is modeled using a one-particle-thick lipid bilayer model. The coarse-grained simulations show the spike proteins adopt bent configurations due to their unique flexibility during their interaction with the ACE2 receptors, which makes it easier for them to attach to the host cell membrane than rigid spikes.
  2. Computational methodologies are increasingly addressing modeling of the whole cell at the molecular level. Proteins and their interactions are the key component of cellular processes. Techniques for modeling protein interactions, thus far, have included protein docking and molecular simulation. The latter approaches account for the dynamics of the interactions but are relatively slow, if carried out at all-atom resolution, or are significantly coarse grained. Protein docking algorithms are far more efficient in sampling spatial coordinates. However, they do not account for the kinetics of the association (i.e., they do not involve the time coordinate). Our proof-of-concept study bridges the two modeling approaches, developing an approach that can reach unprecedented simulation timescales at all-atom resolution. The global intermolecular energy landscape of a large system of proteins was mapped by the pairwise fast Fourier transform docking and sampled in space and time by Monte Carlo simulations. The simulation protocol was parametrized on existing data and validated on a number of observations from experiments and molecular dynamics simulations. The simulation protocol performed consistently across very different systems of proteins at different protein concentrations. It recapitulated data on the previously observed protein diffusion rates and aggregation. The speed of calculation allows reaching second-long trajectories of protein systems that approach the size of the cells, at atomic resolution. 
  3. Fluctuations of protein three-dimensional structures and large-scale conformational transitions are crucial for the biological function of proteins and their complexes. Experimental studies of such phenomena remain very challenging and therefore molecular modeling can be a good alternative or a valuable supporting tool for the investigation of large molecular systems and long-time events. In this minireview, we present two alternative approaches to the coarse-grained (CG) modeling of dynamic properties of protein systems. We discuss two CG representations of polypeptide chains used for Monte Carlo dynamics simulations of protein local dynamics and conformational transitions, and highly simplified structure-based elastic network models of protein flexibility. In contrast to classical all-atom molecular dynamics, the modeling strategies discussed here allow the quite accurate modeling of much larger systems and longer-time dynamic phenomena. We briefly describe the main features of these models and outline some of their applications, including modeling of near-native structure fluctuations, sampling of large regions of the protein conformational space, or possible support for the structure prediction of large proteins and their complexes. 
  4. Nanodiscs are discoidal protein–lipid complexes that have wide applications in membrane protein studies. Modeling and simulation of nanodiscs are challenging due to the absence of structures of many membrane scaffold proteins (MSPs) that wrap around the membrane bilayer. We have developed CHARMM‐GUI Nanodisc Builder (http://www.charmm-gui.org/input/nanodisc) to facilitate the setup of nanodisc simulation systems by modeling the MSPs with defined size and known structural features. A total of 11 different nanodiscs with a diameter from 80 to 180 Å are made available in both the all‐atom CHARMM and two coarse‐grained (PACE and Martini) force fields. The usage of the Nanodisc Builder is demonstrated with various simulation systems. The structures and dynamics of proteins and lipids in these systems were analyzed, showing similar behaviors to those from previous all‐atom and coarse‐grained nanodisc simulations. We expect the Nanodisc Builder to be a convenient and reliable tool for modeling and simulation of nanodisc systems. © 2019 Wiley Periodicals, Inc.
  5. Recent experiments have shown that complexation with a stabilizing compound can preserve enzyme activity in harsh environments. Such complexation is believed to be driven by noncovalent interactions at the enzyme surface, including hydrophobicity and electrostatics. Molecular modeling of these interactions is costly at the all-atom scale due to the long time scales and large particle counts needed to characterize binding. Protein structure at the scale of amino acid residues is parsimoniously represented by a coarse-grained model in which one particle represents several atoms, significantly reducing the cost of simulation. Coarse-grained models may then be used to generate reduced surface descriptions to underlie detailed theories of surface adhesion. In this study, we present two coarse-grained enzyme models—lipase and dehalogenase—that have been prepared using the Martini 3 top-down modeling framework. We simulate each enzyme in aqueous solution and calculate the statistics of protein surface features and shape descriptors. The values from the coarse-grained data are compared with the same calculations performed on all-atom reference systems, revealing key similarities of surface chemistry at the two scales. Structural measures are calculated from the all-atom reference systems and compared with estimates from small-angle x-ray scattering experiments, with good agreement between the two. The described procedures of modeling and analysis comprise a framework for the development of coarse-grained models of protein surfaces with validation to experiment. 