Title: Machine learned coarse-grained protein force-fields: Are we there yet?
The successful recent application of machine learning methods to scientific problems includes the learning of flexible and accurate atomic-level force-fields for materials and biomolecules from quantum chemical data. In parallel, the machine learning of force-fields at coarser resolutions is rapidly gaining relevance as an efficient way to represent the higher-body interactions needed in coarse-grained force-fields to compensate for the omitted degrees of freedom. Coarse-grained models are important for the study of systems at time and length scales exceeding those of atomistic simulations. However, the development of transferable coarse-grained models via machine learning still presents significant challenges. Here, we discuss recent developments in this field and current efforts to address the remaining challenges.
Award ID(s):
2019745
PAR ID:
10512284
Publisher / Repository:
www.sciencedirect.com
Journal Name:
Current Opinion in Structural Biology
Volume:
79
Issue:
C
ISSN:
0959-440X
Page Range / eLocation ID:
102533
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Wei, Guanghong (Ed.)
    Biomolecular condensates are important structures in various cellular processes but are challenging to study using traditional experimental techniques. In silico simulations with residue-level coarse-grained models strike a balance between computational efficiency and chemical accuracy. They could offer valuable insights by connecting the emergent properties of these complex systems with molecular sequences. However, existing coarse-grained models often lack easy-to-follow tutorials and are implemented in software that is not optimal for condensate simulations. To address these issues, we introduce OpenABC, a software package that greatly simplifies the setup and execution of coarse-grained condensate simulations with multiple force fields using Python scripting. OpenABC seamlessly integrates with the OpenMM molecular dynamics engine, enabling efficient simulations with performance on a single GPU that rivals the speed achieved by hundreds of CPUs. We also provide tools that convert coarse-grained configurations to all-atom structures for atomistic simulations. We anticipate that OpenABC will significantly facilitate the adoption of in silico simulations by a broader community to investigate the structural and dynamical properties of condensates. 
  2. Bottom-up coarse-grained (CG) modeling is an effective means of bypassing the limited spatiotemporal scales of conventional atomistic molecular dynamics while retaining essential information from the atomistic model. A central challenge in CG modeling is the trade-off between accuracy and efficiency, as the inclusion of often pivotal many-body interaction terms in the CG force-field renders simulation markedly slower than simple pairwise models. The Ultra Coarse-Graining (UCG) method incorporates many-body terms through discrete internal state variables that modulate the CG force-field according to, e.g., changes in local environment when substantial chemical heterogeneities exist. However, assigning optimal internal states systematically from atomistic simulation data, as well as the practical application of bottom-up UCG theory to biomolecular systems, remain open problems. We develop two synergistic methods to aid in the development of UCG models that can capture inhomogeneities in atomistic systems such as those induced by phase coexistence. The first method establishes the systematic construction of UCG force-fields from a relative entropy minimization principle, while the second method utilizes machine-learning to obtain optimal local order parameters for enhanced model efficiency and transferability. We apply these methods to a methanol liquid–vapor interface and the ripple phase of a 1,2-dipalmitoyl-sn-glycero-3-phosphocholine lipid bilayer and demonstrate that UCG modeling alone recapitulates aspects of phase coexistence that are otherwise not observed in CG modeling. 
  3. We developed coarse-grained models of spike proteins in SARS-CoV-2 coronavirus and angiotensin-converting enzyme 2 (ACE2) receptor proteins to study the endocytosis of a whole coronavirus under physiologically relevant spatial and temporal scales. We first conducted all-atom explicit-solvent molecular dynamics simulations of the recently characterized structures of spike and ACE2 proteins. We then established coarse-grained models using the shape-based coarse-graining approach based on the protein crystal structures and extracted the force field parameters from the all-atom simulation trajectories. To further analyze the coarse-grained models, we carried out normal mode analysis of the coarse-grained models to refine the force field parameters by matching the fluctuations of the internal coordinates with the original all-atom simulations. Finally, we demonstrated the capability of these coarse-grained models by simulating the endocytosis of a whole coronavirus through the host cell membrane. We embedded the coarse-grained models of spikes on the surface of the virus envelope and anchored ACE2 receptors on the host cell membrane, which is modeled using a one-particle-thick lipid bilayer model. The coarse-grained simulations show the spike proteins adopt bent configurations due to their unique flexibility during their interaction with the ACE2 receptors, which makes it easier for them to attach to the host cell membrane than rigid spikes. 
  4. Coarse-grained models describe the macroscopic mean response of a process at large scales, which derives from stochastic processes at small scales. Common examples include accounting for velocity fluctuations in a turbulent fluid flow model and cloud evolution in climate models. Most existing techniques for constructing coarse-grained models feature ill-defined parameters whose values are arbitrarily chosen (e.g., a window size), are narrow in their applicability (e.g., only applicable to time series or spatial data), or cannot readily incorporate physics information. Here, we introduce the concept of physics-guided Gaussian process regression as a machine-learning-based coarse-graining technique that is broadly applicable and amenable to input from known physics-based relationships. Using a pair of case studies derived from molecular dynamics simulations, we demonstrate the attractive properties and superior performance of physics-guided Gaussian processes for coarse-graining relative to prevalent benchmarks. The key advantage of Gaussian-process-based coarse-graining is its ability to seamlessly integrate data-driven and physics-based information. 
  5. Coarse-grained (CG) molecular dynamics can be a powerful method for probing complex processes. However, most CG force fields use sets of pairwise nonbonded interaction potentials, which can limit their ability to capture complex multi-body phenomena such as the hydrophobic effect. As the hydrophobic effect primarily manifests itself due to the nonpolar solute affecting the nearby hydrogen bonding network in water, capturing such effects using a simple one-site (one-“bead”) CG water model is a challenge. In this work, we systematically test the ability of one-site CG water models to capture critical features of the solvent environment around a hydrophobe as well as the potential of mean force (PMF) of neopentane association. We study two bottom-up models: a simple pairwise (SP) force-matched water model constructed using the multiscale coarse-graining method and the Bottom-Up Many-Body Projected Water (BUMPer) model, which has implicit three-body correlations. We also test the top-down monatomic (mW) and the Machine Learned mW (ML-mW) water models. The mW models perform well in capturing structural correlations but not the energetics of the PMF. BUMPer outperforms SP in capturing structural correlations and also gives an accurate PMF in contrast to the two mW models. Our study highlights the importance of including three-body interactions in CG water models, either explicitly or implicitly, while in general highlighting the applicability of bottom-up CG water models for studying hydrophobic effects in a quantitative fashion. This assertion comes with a caveat, however, regarding the accuracy of the enthalpy–entropy decomposition of the PMF of hydrophobe association. 