The self-assembly of biomolecules underlies the formation of macromolecular assemblies and biomolecular materials, as well as protein folding, and is therefore critical in many disciplines and related applications. This process typically spans numerous spatiotemporal scales and hence is well suited to scientific interrogation via coarse-grained (CG) models used in conjunction with a suitable computational approach. This perspective discusses the different coarse-graining approaches that have been used to develop CG models that resolve the self-assembly of biomolecules.
Constructing coarse-grained models with physics-guided Gaussian process regression
Coarse-grained models describe the macroscopic mean response of a process at large scales, which derives from stochastic processes at small scales. Common examples include accounting for velocity fluctuations in a turbulent fluid flow model and cloud evolution in climate models. Most existing techniques for constructing coarse-grained models feature ill-defined parameters whose values are arbitrarily chosen (e.g., a window size), are narrow in their applicability (e.g., only applicable to time series or spatial data), or cannot readily incorporate physics information. Here, we introduce the concept of physics-guided Gaussian process regression as a machine-learning-based coarse-graining technique that is broadly applicable and amenable to input from known physics-based relationships. Using a pair of case studies derived from molecular dynamics simulations, we demonstrate the attractive properties and superior performance of physics-guided Gaussian processes for coarse-graining relative to prevalent benchmarks. The key advantage of Gaussian-process-based coarse-graining is its ability to seamlessly integrate data-driven and physics-based information.
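As an illustration of the general idea only, the following is a minimal sketch of Gaussian process regression in which a known physical relationship supplies the prior mean, so that the data only need to explain the residual. Using the physics as the prior mean is an assumption made here for illustration; the abstract does not specify how the physics enters. The function names, the linear `physics_mean`, the kernel choice, and all hyperparameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior_mean(x_train, y_train, x_test, prior_mean, noise=1e-2):
    """Posterior mean of a GP whose prior mean is a physics-based function.

    The GP is fitted to the residual y - prior_mean(x); far from the data
    the prediction falls back to the physics-based relationship.
    """
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train)
    residual = y_train - prior_mean(x_train)
    alpha = np.linalg.solve(K, residual)
    return prior_mean(x_test) + K_s @ alpha

def physics_mean(x):
    """Hypothetical known relationship: a linear trend with slope 2."""
    return 2.0 * x

# Synthetic noisy "fine-scale" observations (not data from the paper).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 40)
y_train = 2.0 * x_train + 0.3 * np.sin(3.0 * x_train) \
    + 0.05 * rng.standard_normal(x_train.size)

# Coarse-grained mean response on a sparser grid.
x_test = np.linspace(0.0, 5.0, 11)
y_cg = gp_posterior_mean(x_train, y_train, x_test, physics_mean)

# Far outside the data, the estimate reverts to the physics-based mean.
y_far = gp_posterior_mean(x_train, y_train, np.array([50.0]), physics_mean)
```

In this sketch the kernel length scale plays the role that a window size plays in a moving average, with the difference that it could in principle be learned from the data by maximizing the marginal likelihood instead of being chosen by hand.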
- Award ID(s):
- 2034074
- PAR ID:
- 10588125
- Publisher / Repository:
- American Institute of Physics
- Date Published:
- Journal Name:
- APL Machine Learning
- Volume:
- 2
- Issue:
- 2
- ISSN:
- 2770-9019
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Stochastic dynamics, such as molecular dynamics, are important in many scientific applications. However, summarizing and analyzing the results of such simulations is often challenging due to the high dimension in which simulations are carried out and, consequently, due to the very large amount of data that are typically generated. Coarse graining is a popular technique for addressing this problem by providing compact and expressive representations. Coarse graining, however, potentially comes at the cost of accuracy, as dynamical information is, in general, lost when projecting the problem in a lower-dimensional space. This article shows how to eliminate coarse-graining error using two key ideas. First, we represent coarse-grained dynamics as a Markov renewal process. Second, we outline a data-driven, non-parametric Mori–Zwanzig approach for computing jump times of the renewal process. Numerical tests on a small protein illustrate the method.
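To make the Markov renewal picture concrete, here is a minimal sketch that reduces a discretized coarse-grained state trajectory to the two ingredients of such a process: the jump probabilities of the embedded chain and the holding times between jumps. It uses plain empirical counting, not the Mori–Zwanzig estimator mentioned in the abstract, and the function name, the toy trajectory, and the time step are hypothetical.

```python
import numpy as np

def renewal_statistics(states, dt=1.0, n_states=None):
    """Empirical jump probabilities and mean holding times.

    states : sequence of integer coarse-grained state labels, one per frame
    dt     : time between frames
    Returns (P, mean_hold), where P[i, j] is the fraction of jumps out of
    state i that land in state j, and mean_hold[i] is the average time
    spent in state i before a jump. The final, unfinished visit is
    ignored because its holding time is censored.
    """
    states = np.asarray(states)
    if n_states is None:
        n_states = int(states.max()) + 1
    counts = np.zeros((n_states, n_states))
    hold_sum = np.zeros(n_states)
    hold_n = np.zeros(n_states)
    start = 0  # frame index at which the current visit began
    for t in range(1, len(states)):
        if states[t] != states[t - 1]:
            i, j = states[t - 1], states[t]
            counts[i, j] += 1
            hold_sum[i] += (t - start) * dt
            hold_n[i] += 1
            start = t
    row = counts.sum(axis=1, keepdims=True)
    P = np.divide(counts, row, out=np.zeros_like(counts), where=row > 0)
    mean_hold = np.divide(hold_sum, hold_n,
                          out=np.zeros_like(hold_sum), where=hold_n > 0)
    return P, mean_hold

# Toy trajectory over three coarse-grained states.
traj = [0, 0, 0, 1, 1, 0, 0, 2, 2, 2, 2, 0]
P, mean_hold = renewal_statistics(traj, dt=1.0)
```

For this toy trajectory, state 0 jumps to states 1 and 2 with equal probability, and the mean holding times are 2.5, 2, and 4 frames for states 0, 1, and 2 respectively. A full renewal model would keep the whole holding-time distribution for each jump, not only its mean.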
-
Physics-based, atom-centered machine learning (ML) representations have been instrumental to the effective integration of ML within the atomistic simulation community. Many of these representations build on the idea of atoms as having spherical, or isotropic, interactions. In many communities, there is often a need to represent groups of atoms, either to increase the computational efficiency of simulation via coarse-graining or to understand molecular influences on system behavior. In such cases, atom-centered representations will have limited utility, as groups of atoms may not be well-approximated as spheres. In this work, we extend the popular Smooth Overlap of Atomic Positions (SOAP) ML representation to systems consisting of non-spherical anisotropic particles or clusters of atoms. We show the power of this anisotropic extension of SOAP, which we call AniSOAP, in accurately characterizing liquid crystal systems and predicting the energetics of Gay–Berne ellipsoids and coarse-grained benzene crystals. With our study of these prototypical anisotropic systems, we derive fundamental insights on how molecular shape influences mesoscale behavior and explain how to reincorporate important atom–atom interactions typically not captured by coarse-grained models. Moving forward, we propose AniSOAP as a flexible, unified framework for coarse-graining in complex, multiscale simulation.
-
We developed coarse-grained models of the spike proteins of the SARS-CoV-2 coronavirus and of angiotensin-converting enzyme 2 (ACE2) receptor proteins to study the endocytosis of a whole coronavirus under physiologically relevant spatial and temporal scales. We first conducted all-atom explicit-solvent molecular dynamics simulations of the recently characterized structures of spike and ACE2 proteins. We then established coarse-grained models using the shape-based coarse-graining approach based on the protein crystal structures and extracted the force field parameters from the all-atom simulation trajectories. To further analyze the coarse-grained models, we carried out normal mode analysis of the coarse-grained models to refine the force field parameters by matching the fluctuations of the internal coordinates with the original all-atom simulations. Finally, we demonstrated the capability of these coarse-grained models by simulating the endocytosis of a whole coronavirus through the host cell membrane. We embedded the coarse-grained models of spikes on the surface of the virus envelope and anchored ACE2 receptors on the host cell membrane, which is modeled using a one-particle-thick lipid bilayer model. The coarse-grained simulations show that the spike proteins adopt bent configurations due to their unique flexibility during their interaction with the ACE2 receptors, which makes it easier for them to attach to the host cell membrane than rigid spikes.
-
Coarse-graining is a powerful tool for extending the reach of dynamic models of proteins and other biological macromolecules. Topological coarse-graining, in which biomolecules or sets thereof are represented via graph structures, is a particularly useful way of obtaining highly compressed representations of molecular structures, and simulations operating via such representations can achieve substantial computational savings. A drawback of coarse-graining, however, is the loss of atomistic detail, an effect that is especially acute for topological representations such as protein structure networks (PSNs). Here, we introduce an approach based on a combination of machine learning and physically-guided refinement for inferring atomic coordinates from PSNs. This “neural upscaling” procedure exploits the constraints implied by PSNs on possible configurations, as well as differences in the likelihood of observing different configurations with the same PSN. Using a 1 μs atomistic molecular dynamics trajectory of Aβ1–40, we show that neural upscaling is able to effectively recapitulate detailed structural information for intrinsically disordered proteins, being particularly successful in recovering features such as transient secondary structure. These results suggest that scalable network-based models for protein structure and dynamics may be used in settings where atomistic detail is desired, with upscaling employed to impute atomic coordinates from PSNs.
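For readers unfamiliar with topological coarse-graining, the sketch below shows the forward direction only: collapsing per-residue coordinates into a contact graph, which is one common way to build a protein structure network. It is not the neural upscaling procedure, and the distance cutoff, the sequence-separation rule, the function name, and the toy coordinates are all assumptions made for illustration.

```python
import numpy as np

def contact_network(coords, cutoff=8.0, min_seq_sep=1):
    """Boolean adjacency matrix of a residue contact graph.

    coords      : (n_residues, 3) array, one representative point per residue
    cutoff      : residues closer than this distance are connected
                  (hypothetical value, in the same units as coords)
    min_seq_sep : minimum index separation for an edge; 1 excludes only
                  self-contacts
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    idx = np.arange(len(coords))
    seq_sep = np.abs(idx[:, None] - idx[None, :])
    return (dist < cutoff) & (seq_sep >= min_seq_sep)

# Toy chain: three residues spaced 3.8 units apart and one distant residue.
toy_coords = [[0.0, 0.0, 0.0],
              [3.8, 0.0, 0.0],
              [7.6, 0.0, 0.0],
              [20.0, 0.0, 0.0]]
adjacency = contact_network(toy_coords)
```

The adjacency matrix keeps only which residues are in contact and discards the distances themselves. That discarded geometry is the information an upscaling step would have to infer.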