Title: Arbitrarily accurate, nonparametric coarse graining with Markov renewal processes and the Mori–Zwanzig formulation
Stochastic dynamics, such as molecular dynamics, are important in many scientific applications. However, summarizing and analyzing the results of such simulations is often challenging due to the high dimension in which simulations are carried out and, consequently, due to the very large amount of data that are typically generated. Coarse graining is a popular technique for addressing this problem by providing compact and expressive representations. Coarse graining, however, potentially comes at the cost of accuracy, as dynamical information is, in general, lost when projecting the problem in a lower-dimensional space. This article shows how to eliminate coarse-graining error using two key ideas. First, we represent coarse-grained dynamics as a Markov renewal process. Second, we outline a data-driven, non-parametric Mori–Zwanzig approach for computing jump times of the renewal process. Numerical tests on a small protein illustrate the method.
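For readers unfamiliar with the term, the sketch below illustrates what a Markov renewal process is: the next coarse state depends only on the current one, while the waiting time before each jump may follow any positive distribution rather than only an exponential one. It is a generic illustration, not the authors' method, and it does not reproduce the Mori–Zwanzig computation of jump times. The function names, the two-state transition matrix `P`, and the waiting-time distributions are all hypothetical choices made for this example.

```python
import random


def simulate_markov_renewal(P, sample_sojourn, start, t_max, rng):
    """Simulate a Markov renewal process up to time t_max.

    P[i][j] is the probability that the next state is j given current state i.
    sample_sojourn(i, j, rng) draws the waiting time before the jump i -> j.
    Returns a list of (jump_time, state) pairs, beginning with (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        # The next state depends only on the current state (embedded Markov chain).
        nxt = rng.choices(range(len(P[state])), weights=P[state])[0]
        # The waiting time may follow any positive distribution.
        t += sample_sojourn(state, nxt, rng)
        if t > t_max:
            return path
        state = nxt
        path.append((t, state))


# Hypothetical two-state example (say, 0 = "folded", 1 = "unfolded").
P = [[0.0, 1.0],
     [1.0, 0.0]]


def sample_sojourn(i, j, rng):
    # Made-up distributions, chosen only because they are not exponential.
    if i == 0:
        return rng.uniform(0.5, 1.5)
    return rng.gammavariate(2.0, 1.0)


path = simulate_markov_renewal(P, sample_sojourn, start=0, t_max=20.0,
                               rng=random.Random(0))
for t, s in path[:5]:
    print(f"t = {t:.3f}  state = {s}")
```

In a data-driven setting, the transition probabilities and waiting-time distributions would be estimated from simulation trajectories instead of being specified by hand as they are here.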
Award ID(s):
2111277
PAR ID:
10557080
Author(s) / Creator(s):
; ;
Publisher / Repository:
AIP Publishing
Date Published:
Journal Name:
AIP Advances
Volume:
13
Issue:
9
ISSN:
2158-3226
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Coarse-grained models describe the macroscopic mean response of a process at large scales, which derives from stochastic processes at small scales. Common examples include accounting for velocity fluctuations in a turbulent fluid flow model and cloud evolution in climate models. Most existing techniques for constructing coarse-grained models feature ill-defined parameters whose values are arbitrarily chosen (e.g., a window size), are narrow in their applicability (e.g., only applicable to time series or spatial data), or cannot readily incorporate physics information. Here, we introduce the concept of physics-guided Gaussian process regression as a machine-learning-based coarse-graining technique that is broadly applicable and amenable to input from known physics-based relationships. Using a pair of case studies derived from molecular dynamics simulations, we demonstrate the attractive properties and superior performance of physics-guided Gaussian processes for coarse-graining relative to prevalent benchmarks. The key advantage of Gaussian-process-based coarse-graining is its ability to seamlessly integrate data-driven and physics-based information. 
  2. Coarse-graining is a powerful tool for extending the reach of dynamic models of proteins and other biological macromolecules. Topological coarse-graining, in which biomolecules or sets thereof are represented via graph structures, is a particularly useful way of obtaining highly compressed representations of molecular structures, and simulations operating via such representations can achieve substantial computational savings. A drawback of coarse-graining, however, is the loss of atomistic detail—an effect that is especially acute for topological representations such as protein structure networks (PSNs). Here, we introduce an approach based on a combination of machine learning and physically-guided refinement for inferring atomic coordinates from PSNs. This “neural upscaling” procedure exploits the constraints implied by PSNs on possible configurations, as well as differences in the likelihood of observing different configurations with the same PSN. Using a 1 μs atomistic molecular dynamics trajectory of Aβ1–40, we show that neural upscaling is able to effectively recapitulate detailed structural information for intrinsically disordered proteins, being particularly successful in recovering features such as transient secondary structure. These results suggest that scalable network-based models for protein structure and dynamics may be used in settings where atomistic detail is desired, with upscaling employed to impute atomic coordinates from PSNs. 
  3. Modeling a high-dimensional Hamiltonian system in reduced dimensions with respect to coarse-grained (CG) variables can greatly reduce computational cost and enable efficient bottom-up prediction of main features of the system for many applications. However, it usually experiences significantly altered dynamics due to loss of degrees of freedom upon coarse-graining. To establish CG models that can faithfully preserve dynamics, previous efforts mainly focused on equilibrium systems. In contrast, various soft matter systems are known to be out of equilibrium. Therefore, the present work concerns non-equilibrium systems and enables accurate and efficient CG modeling that preserves non-equilibrium dynamics and is generally applicable to any non-equilibrium process and any observable of interest. To this end, the dynamic equation of a CG variable is built in the form of the non-stationary generalized Langevin equation (nsGLE), where the two-time memory kernel is determined from the data of the auto-correlation function of the observable of interest. By embedding the nsGLE in an extended dynamics framework, the nsGLE can be solved efficiently to predict the non-equilibrium dynamics of the CG variable. To prove and exploit the equivalence of the nsGLE and extended dynamics, the memory kernel is parameterized in a two-time exponential expansion. A data-driven hybrid optimization process is proposed for the parameterization, which integrates the differential-evolution method with the Levenberg–Marquardt algorithm to efficiently tackle a non-convex and high-dimensional optimization problem.
  4. The integral equation coarse-graining (IECG) approach is a promising high-level coarse-graining (CG) method for polymer melts, with variable resolution from soft spheres to multi CG sites, which preserves the structural and thermodynamical consistencies with the related atomistic simulations. When compared to the atomistic description, the procedure of coarse-graining results in smoother free energy surfaces, longer-ranged potentials, a decrease in the number of interaction sites for a given polymer, and more. Because these changes have competing effects on the computational efficiency of the CG model, care needs to be taken when studying the effect of coarse-graining on the computational speed-up in CG molecular dynamics simulations. For instance, treatment of long-range CG interactions requires the selection of cutoff distances that include the attractive part of the effective CG potential and force. In particular, we show how the complex nature of the range and curvature of the effective CG potential, the selection of a suitable CG timestep, the choice of the cutoff distance, the molecular dynamics algorithms, and the smoothness of the CG free energy surface affect the efficiency of IECG simulations. By direct comparison with the atomistic simulations of relatively short chain polymer melts, we find that the overall computational efficiency is highest for the highest level of CG (soft spheres), with an overall improvement of the computational efficiency being about 10⁶–10⁸ for various CG levels/resolutions. Therefore, the IECG method can have important applications in molecular dynamics simulations of polymeric systems. Finally, making use of the standard spatial decomposition algorithm, the parallel scalability of the IECG simulations for various levels of CG is presented. Optimal parallel scaling is observed for a reasonably large number of processors. Although this study is performed using the IECG approach, its results on the relation between the level of CG and the computational efficiency are general and apply to any properly-constructed CG model.
  5. Coarse graining techniques play an essential role in accelerating molecular simulations of systems with large length and time scales. Theoretically grounded bottom-up models are appealing due to their thermodynamic consistency with the underlying all-atom models. In this direction, machine learning approaches hold great promise to fitting complex many-body data. However, training models may require collection of large amounts of expensive data. Moreover, quantifying trained model accuracy is challenging, especially in cases of non-trivial free energy configurations, where training data may be sparse. We demonstrate a path towards uncertainty-aware models of coarse grained free energy surfaces. Specifically, we show that principled Bayesian model uncertainty allows for efficient data collection through an on-the-fly active learning framework and opens the possibility of adaptive transfer of models across different chemical systems. Uncertainties also characterize models’ accuracy of free energy predictions, even when training is performed only on forces. This work helps pave the way towards efficient autonomous training of reliable and uncertainty aware many-body machine learned coarse grain models.