This content will become publicly available on August 21, 2025

Title: Expanding density-correlation machine learning representations for anisotropic coarse-grained particles
Physics-based, atom-centered machine learning (ML) representations have been instrumental to the effective integration of ML within the atomistic simulation community. Many of these representations build on the idea of atoms as having spherical, or isotropic, interactions. In many communities, there is often a need to represent groups of atoms, either to increase the computational efficiency of simulation via coarse-graining or to understand molecular influences on system behavior. In such cases, atom-centered representations will have limited utility, as groups of atoms may not be well-approximated as spheres. In this work, we extend the popular Smooth Overlap of Atomic Positions (SOAP) ML representation to systems consisting of non-spherical anisotropic particles or clusters of atoms. We show the power of this anisotropic extension of SOAP, which we call AniSOAP, in accurately characterizing liquid crystal systems and predicting the energetics of Gay–Berne ellipsoids and coarse-grained benzene crystals. With our study of these prototypical anisotropic systems, we derive fundamental insights into how molecular shape influences mesoscale behavior and explain how to reincorporate important atom–atom interactions typically not captured by coarse-grained models. Moving forward, we propose AniSOAP as a flexible, unified framework for coarse-graining in complex, multiscale simulation.
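The abstract's point that a group of atoms may not be well-approximated as a sphere can be illustrated with a small calculation. The sketch below is not AniSOAP and is not taken from the paper. It uses a standard shape measure, the relative shape anisotropy of the gyration tensor, and the function names and test geometries (an idealized planar hexagon standing in for a benzene carbon ring, an octahedron, and a rod) are hypothetical choices made for illustration.

```python
import math


def gyration_tensor(points):
    """Return the unweighted 3x3 gyration tensor of a list of (x, y, z) points."""
    n = len(points)
    center = [sum(p[k] for p in points) / n for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[k] - center[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                tensor[i][j] += d[i] * d[j] / n
    return tensor


def relative_shape_anisotropy(points):
    """Compute kappa^2 = 1.5 * tr(S^2) / tr(S)^2 - 0.5 from tensor invariants.

    The value is 0 when the three principal moments are equal and 1 when
    all points lie on a line. Needs at least two distinct points, since
    tr(S) is zero otherwise.
    """
    s = gyration_tensor(points)
    trace = sum(s[i][i] for i in range(3))
    trace_of_square = sum(s[i][j] * s[j][i] for i in range(3) for j in range(3))
    return 1.5 * trace_of_square / trace**2 - 0.5


# Hypothetical test geometries (idealized coordinates, arbitrary length units).
hexagon = [(1.39 * math.cos(k * math.pi / 3), 1.39 * math.sin(k * math.pi / 3), 0.0)
           for k in range(6)]
octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
rod = [(0.0, 0.0, float(z)) for z in range(4)]

for name, pts in [("hexagon", hexagon), ("octahedron", octahedron), ("rod", rod)]:
    print(f"{name}: kappa^2 = {relative_shape_anisotropy(pts):.3f}")
```

Under these assumptions the planar ring gives a value of 0.25, between the octahedron (0) and the rod (1), which is one simple way to see why a single isotropic bead discards shape information for a molecule such as benzene.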
Award ID(s):
2309000
PAR ID:
10578282
Publisher / Repository:
AIP Publishing
Journal Name:
The Journal of Chemical Physics
Volume:
161
Issue:
7
ISSN:
0021-9606
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent experiments have shown that enzyme activity can be preserved in harsh environments by complexing the enzyme with a polymer into a protein-polymer hybrid (PPH). In a successful PPH, heteropolymer strands bind to the enzyme surface and restrain the folded protein without adversely affecting the binding and active sites. It is believed that hybridization is driven by noncovalent interactions at the enzyme surface, including hydrophobicity and electrostatics. Molecular modeling of these interactions is not practical at the all-atom scale due to the long timescales and large particle counts needed to characterize binding. Protein structure at the scale of amino acid residues is parsimoniously represented by a coarse-grained model in which one particle represents several atoms, significantly reducing the cost of simulation. In this study we present two coarse-grained enzyme models, lipase and dehalogenase, prepared using a top-down modeling strategy. We simulate each enzyme in aqueous solution and calculate statistics of protein surface features and shape descriptors. The values from the coarse-grained data are compared with the same calculations performed on all-atom reference systems, revealing key similarities of surface chemistry at the two scales. Structural measures are calculated from the all-atom reference systems and compared with estimates from small-angle X-ray scattering (SAXS) experiments, with good agreement between the two. The described procedures of modeling and analysis comprise a framework for the development of coarse-grained models of protein surfaces with validation against experiment.
  2. We developed coarse-grained models of spike proteins in SARS-CoV-2 coronavirus and angiotensin-converting enzyme 2 (ACE2) receptor proteins to study the endocytosis of a whole coronavirus under physiologically relevant spatial and temporal scales. We first conducted all-atom explicit-solvent molecular dynamics simulations of the recently characterized structures of spike and ACE2 proteins. We then established coarse-grained models using the shape-based coarse-graining approach based on the protein crystal structures and extracted the force field parameters from the all-atom simulation trajectories. To further analyze the coarse-grained models, we carried out normal mode analysis of the coarse-grained models to refine the force field parameters by matching the fluctuations of the internal coordinates with the original all-atom simulations. Finally, we demonstrated the capability of these coarse-grained models by simulating the endocytosis of a whole coronavirus through the host cell membrane. We embedded the coarse-grained models of spikes on the surface of the virus envelope and anchored ACE2 receptors on the host cell membrane, which is modeled using a one-particle-thick lipid bilayer model. The coarse-grained simulations show the spike proteins adopt bent configurations due to their unique flexibility during their interaction with the ACE2 receptors, which makes it easier for them to attach to the host cell membrane than rigid spikes.
  3. Molecular simulations of biomacromolecules that assemble into multimeric complexes remain a challenge due to computationally inaccessible length and time scales. Low-resolution and implicit-solvent coarse-grained modeling approaches using traditional nonbonded interactions (both pairwise and spherically isotropic) have been able to partially address this gap. However, these models may fail to capture the complex anisotropic interactions present at macromolecular interfaces unless higher-order interaction potentials are incorporated at the expense of computational cost. In this work, we introduce an alternative, systematic approach to represent directional interactions at protein–protein interfaces by using virtual sites restricted to pairwise interactions. We show that virtual site interaction parameters can be optimized within a relative entropy minimization framework by using only information from known statistics between coarse-grained sites. We compare our virtual site models to traditional coarse-grained models using two case studies of multimeric protein assemblies and find that the virtual site models predict pairwise correlations with higher fidelity and, more importantly, assembly behavior that is morphologically consistent with experiments. Our study underscores the importance of anisotropic interaction representations and paves the way for more accurate yet computationally efficient coarse-grained simulations of macromolecular assembly in future research.
  4. Stochastic dynamics, such as molecular dynamics, are important in many scientific applications. However, summarizing and analyzing the results of such simulations is often challenging due to the high dimension in which simulations are carried out and, consequently, the very large amount of data that is typically generated. Coarse graining is a popular technique for addressing this problem by providing compact and expressive representations. Coarse graining, however, potentially comes at the cost of accuracy, as dynamical information is, in general, lost when projecting the problem onto a lower-dimensional space. This article shows how to eliminate coarse-graining error using two key ideas. First, we represent coarse-grained dynamics as a Markov renewal process. Second, we outline a data-driven, non-parametric Mori–Zwanzig approach for computing jump times of the renewal process. Numerical tests on a small protein illustrate the method.
  5. Coarse graining techniques play an essential role in accelerating molecular simulations of systems with large length and time scales. Theoretically grounded bottom-up models are appealing due to their thermodynamic consistency with the underlying all-atom models. In this direction, machine learning approaches hold great promise for fitting complex many-body data. However, training models may require collection of large amounts of expensive data. Moreover, quantifying trained model accuracy is challenging, especially in cases of non-trivial free energy configurations, where training data may be sparse. We demonstrate a path towards uncertainty-aware models of coarse-grained free energy surfaces. Specifically, we show that principled Bayesian model uncertainty allows for efficient data collection through an on-the-fly active learning framework and opens the possibility of adaptive transfer of models across different chemical systems. Uncertainties also characterize the accuracy of the models' free energy predictions, even when training is performed only on forces. This work helps pave the way towards efficient autonomous training of reliable and uncertainty-aware many-body machine-learned coarse-grained models.