Bottom-up coarse-grained (CG) molecular dynamics models are parameterized using complex effective Hamiltonians. These models are typically optimized to approximate high-dimensional data from atomistic simulations. However, human validation of these models is often limited to low-dimensional statistics that do not necessarily differentiate between the CG model and the atomistic simulations. We propose that classification can be used to variationally estimate high-dimensional error and that explainable machine learning can help convey this information to scientists. This approach is demonstrated using Shapley additive explanations and two CG protein models. This framework may also be valuable for ascertaining whether allosteric effects at the atomistic level are accurately propagated to a CG model.
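The idea of using a classifier as a variational error estimate can be illustrated with a small sketch. It relies on one standard bound of this kind, which is an assumption here and not necessarily the one used by the authors: for balanced classes, log 2 minus the held-out cross-entropy of any classifier is a lower bound on the Jensen-Shannon divergence between the two ensembles. The 1D Gaussian "atomistic" and "CG" samples, the logistic model, and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, ys, lr=0.1, epochs=200):
    """Fit p(atomistic | x) = sigmoid(w*x + b) by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def jsd_lower_bound(xs, ys, w, b):
    """log(2) minus mean cross-entropy; assumes equal numbers of each class."""
    eps = 1e-12
    ce = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(w * x + b), eps), 1.0 - eps)
        ce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return math.log(2.0) - ce / len(xs)


def make_samples(rng, n, cg_shift):
    """Toy data: label 1 = 'atomistic' N(0, 1), label 0 = 'CG' N(cg_shift, 1)."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xs += [rng.gauss(cg_shift, 1.0) for _ in range(n)]
    ys = [1] * n + [0] * n
    return xs, ys


def demo(seed=0, n=500, cg_shift=2.0):
    rng = random.Random(seed)
    train_x, train_y = make_samples(rng, n, cg_shift)
    test_x, test_y = make_samples(rng, n, cg_shift)  # held-out evaluation
    w, b = train_logistic(train_x, train_y)
    return jsd_lower_bound(test_x, test_y, w, b)
```

A value near zero means the classifier cannot tell the two ensembles apart along the chosen features; larger values (up to log 2) indicate a detectable discrepancy. Evaluating on held-out samples keeps the estimate from being inflated by overfitting.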
Efficient sampling of the conformational space is essential for quantitative simulations of proteins. The multiscale enhanced sampling (MSES) method accelerates atomistic sampling by coupling it to a coarse‐grained (CG) simulation. Bias from coupling to the CG model is removed using Hamiltonian replica exchange, such that one could benefit simultaneously from the high accuracy of atomistic models and fast dynamics of CG ones. Here, we extend MSES to allow independent control of the effective temperatures of atomistic and CG simulations, by directly scaling the atomistic and CG Hamiltonians. The new algorithm, named MSES with independent tempering (MSES‐IT), supports more sophisticated Hamiltonian and temperature replica exchange protocols to further improve the sampling efficiency. Using a small but nontrivial β‐hairpin, we show that setting the effective temperature of the CG model in all conditions to its melting temperature maximizes structural transition rates at the CG level and promotes more efficient replica exchange and diffusion in the condition space. As a result, MSES‐IT drives faster reversible transitions at the atomic level and leads to significant improvement in generating converged conformational ensembles compared to the original MSES scheme.
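The exchange step underlying Hamiltonian replica exchange can be sketched with the general Metropolis criterion. The code below is a minimal illustration, not the published MSES‐IT implementation: the form of the reduced energy (separate scaling factors `s_at` and `s_cg` on the atomistic and CG energies, plus a coupling strength `lam`) and all names are assumptions made for the example, and energies are in reduced units.

```python
import math
import random


def reduced_energy(conf, cond):
    """Hypothetical reduced potential: beta * (s_at*E_AT + s_cg*E_CG + lam*E_couple).

    conf: tuple (e_at, e_cg, e_couple) of energy terms for one configuration.
    cond: dict with keys 'beta', 's_at', 's_cg', 'lam' defining one condition.
    """
    e_at, e_cg, e_couple = conf
    scaled = cond["s_at"] * e_at + cond["s_cg"] * e_cg + cond["lam"] * e_couple
    return cond["beta"] * scaled


def swap_probability(conf_i, cond_i, conf_j, cond_j):
    """Metropolis probability of exchanging configurations between two conditions."""
    before = reduced_energy(conf_i, cond_i) + reduced_energy(conf_j, cond_j)
    after = reduced_energy(conf_j, cond_i) + reduced_energy(conf_i, cond_j)
    delta = after - before
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_swap(conf_i, cond_i, conf_j, cond_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(conf_i, cond_i, conf_j, cond_j)
```

In this sketch, scaling `s_cg` independently of `s_at` plays the role of giving the CG part its own effective temperature; each energy term must be re-evaluated for the swapped configuration under the partner condition before the acceptance test.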
- Award ID(s):
- 1817332
- NSF-PAR ID:
- 10454252
- Publisher / Repository:
- Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name:
- Journal of Computational Chemistry
- Volume:
- 42
- Issue:
- 5
- ISSN:
- 0192-8651
- Page Range / eLocation ID:
- p. 358-364
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
The integral equation coarse-graining (IECG) approach is a promising high-level coarse-graining (CG) method for polymer melts, with variable resolution from soft spheres to multi CG sites, which preserves the structural and thermodynamical consistencies with the related atomistic simulations. When compared to the atomistic description, the procedure of coarse-graining results in smoother free energy surfaces, longer-ranged potentials, a decrease in the number of interaction sites for a given polymer, and more. Because these changes have competing effects on the computational efficiency of the CG model, care needs to be taken when studying the effect of coarse-graining on the computational speed-up in CG molecular dynamics simulations. For instance, treatment of long-range CG interactions requires the selection of cutoff distances that include the attractive part of the effective CG potential and force. In particular, we show how the complex nature of the range and curvature of the effective CG potential, the selection of a suitable CG timestep, the choice of the cutoff distance, the molecular dynamics algorithms, and the smoothness of the CG free energy surface affect the efficiency of IECG simulations. By direct comparison with the atomistic simulations of relatively short chain polymer melts, we find that the overall computational efficiency is highest for the highest level of CG (soft spheres), with an overall improvement of the computational efficiency being about 10⁶–10⁸ for various CG levels/resolutions. Therefore, the IECG method can have important applications in molecular dynamics simulations of polymeric systems. Finally, making use of the standard spatial decomposition algorithm, the parallel scalability of the IECG simulations for various levels of CG is presented. Optimal parallel scaling is observed for a reasonably large number of processors.
Although this study is performed using the IECG approach, its results on the relation between the level of CG and the computational efficiency are general and apply to any properly-constructed CG model.
-
This paper series aims to establish a complete correspondence between fine-grained (FG) and coarse-grained (CG) dynamics by way of excess entropy scaling (introduced in Paper I). While Paper II successfully captured translational motions in CG systems using a hard sphere mapping, the absence of rotational motions in single-site CG models introduces differences between FG and CG dynamics. In this third paper, our objective is to faithfully recover atomistic diffusion coefficients from CG dynamics by incorporating rotational dynamics. By extracting FG rotational diffusion, we unravel, for the first time to our knowledge, a universality in excess entropy scaling between the rotational and translational diffusion. Once the missing rotational dynamics are integrated into the CG translational dynamics, an effective translation-rotation coupling becomes essential. We propose two different approaches for estimating this coupling parameter: the rough hard sphere theory with acentric factor (temperature-independent) or the rough Lennard-Jones model with CG attractions (temperature-dependent). Altogether, we demonstrate that FG diffusion coefficients can be recovered from CG diffusion coefficients by (1) incorporating “entropy-free” rotational diffusion with translation-rotation coupling and (2) recapturing the missing entropy. Our findings shed light on the fundamental relationship between FG and CG dynamics in molecular fluids.
-
Understanding how proteins fold has remained a problem of great interest in biophysical research. Atomistic computer simulations using physics-based force fields can provide important insights on the interplay of different interactions and energetics and their roles in governing the folding thermodynamics and mechanism. In particular, generalized Born (GB)-based implicit solvent force fields can be optimized to provide an appropriate balance between solvation and intramolecular interactions and successfully recapitulate experimental conformational equilibria for a set of helical and β-hairpin peptides. Here, we further demonstrate that key thermodynamic properties and their temperature dependence obtained from replica exchange molecular dynamics simulations of these peptides are in quantitative agreement with experimental results. Useful lessons can be learned on how the interplay of entropy and sequentially long-range interactions governs the mechanism and cooperativity of folding. These results highlight the great potential of high-quality implicit solvent force fields for studying protein folding and large-scale conformational transitions.
-
Tropoelastin is the dominant building block of elastic fibers, which form a major component of the extracellular matrix, providing structural support to tissues and imbuing them with elasticity and resilience. Recently, the atomistic structure of human tropoelastin was described, obtained through accelerated sampling via replica exchange molecular dynamics simulations. Here, principal component analysis is used to consider the ensemble of structures accessible to tropoelastin at body temperature (37 °C) at which tropoelastin naturally self‐assembles into aggregated coacervates. These coacervates are relevant because they are an essential intermediate assembly stage, where tropoelastin molecules are then cross‐linked at lysine residues and integrated into growing elastic fibers. It is found that the ensemble preserves the canonical tropoelastin structure with an extended molecular body flanked by two protruding legs, and identifies variations in specific domain positioning within this global shape. Furthermore, it is found that lysine residues show a large variation in their location on the tropoelastin molecule compared with other residues. It is hypothesized that this perturbation of the lysines increases their accessibility and enhances cross‐linking. Finally, the principal component modes are extracted to describe the range of tropoelastin's conformational fluctuation to validate tropoelastin's scissor‐twist motion that was predicted earlier.