Bottom-up coarse-grained (CG) molecular dynamics models are parameterized using complex effective Hamiltonians. These models are typically optimized to approximate high dimensional data from atomistic simulations. However, human validation of these models is often limited to low dimensional statistics that do not necessarily differentiate between the CG model and said atomistic simulations. We propose that classification can be used to variationally estimate high dimensional error and that explainable machine learning can help convey this information to scientists. This approach is demonstrated using Shapley additive explanations and two CG protein models. This framework may also be valuable for ascertaining whether allosteric effects at the atomistic level are accurately propagated to a CG model.
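One way to read the classification idea in the abstract above is sketched here. It is a minimal illustration, not the authors' implementation: it relies on the standard result that, for balanced classes, log(2) minus a classifier's cross-entropy is a lower bound on the Jensen-Shannon divergence between the two sample distributions. The logistic-regression classifier, the synthetic Gaussian "atomistic" and "CG" samples, and all function names are assumptions made for illustration.

```python
import numpy as np

def add_bias(x):
    """Append a constant column so the last weight acts as the intercept."""
    return np.hstack([x, np.ones((len(x), 1))])

def fit_logistic(x, y, steps=2000, lr=0.5):
    """Fit a logistic-regression classifier by full-batch gradient descent."""
    xb = add_bias(x)
    w = np.zeros(xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(xb @ w)))
        w -= lr * xb.T @ (p - y) / len(y)
    return w

def js_lower_bound(x_ref, x_cg, w):
    """Return log(2) minus the balanced cross-entropy on the given samples (nats)."""
    p_ref = np.clip(1.0 / (1.0 + np.exp(-(add_bias(x_ref) @ w))), 1e-12, 1 - 1e-12)
    p_cg = np.clip(1.0 / (1.0 + np.exp(-(add_bias(x_cg) @ w))), 1e-12, 1 - 1e-12)
    loss = 0.5 * (-np.log(p_ref).mean() - np.log(1.0 - p_cg).mean())
    return np.log(2.0) - loss

def estimate_error(x_ref, x_cg):
    """Train on the first half of each sample set, evaluate the bound on the second."""
    h_ref, h_cg = len(x_ref) // 2, len(x_cg) // 2
    x = np.vstack([x_ref[:h_ref], x_cg[:h_cg]])
    y = np.concatenate([np.ones(h_ref), np.zeros(h_cg)])  # 1 = reference, 0 = CG
    w = fit_logistic(x, y)
    return js_lower_bound(x_ref[h_ref:], x_cg[h_cg:], w)

# Synthetic stand-ins (assumptions): 5 features per configuration.
rng = np.random.default_rng(0)
atomistic = rng.normal(size=(4000, 5))   # "mapped atomistic" samples
cg_good = rng.normal(size=(4000, 5))     # CG samples from the same distribution
cg_bad = rng.normal(size=(4000, 5))
cg_bad[:, 0] += 2.0                      # CG samples shifted along one feature

print("matching model:", estimate_error(atomistic, cg_good))
print("shifted model: ", estimate_error(atomistic, cg_bad))
```

A value near zero means the classifier cannot tell the two ensembles apart with these features; a value approaching log(2) means they are almost perfectly separable. Evaluating on held-out samples keeps the estimate from being inflated by overfitting.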
Chemically realistic coarse-grained models for polyelectrolyte solutions
Polyelectrolyte solutions are of considerable scientific and practical importance. One of the most widely studied polymers is polystyrene sulfonate (PSS), which has a hydrophobic backbone with pendant charged groups. A polycation with a similar chemical structure is poly(vinyl benzyltri methyl) ammonium (PVBTMA). In this work, we develop coarse-grained (CG) models for PSS and PVBTMA with explicit CG water and with sodium and chloride counterions, respectively. We benchmark the CG models via a comparison with atomistic simulations for single chains. We find that the choice of the topology and the partial charge distribution of the CG model both play a crucial role in the ability of the CG model to reproduce results from atomistic simulations. Injudicious choices of the local charge distribution have dramatic consequences, e.g., collapse of the polyions. The polyanions and polycations exhibit similar conformational and dynamical behavior, suggesting that the sign of the polyion charge does not play a significant role.
- Award ID(s): 1856595
- PAR ID: 10363357
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: The Journal of Chemical Physics
- Volume: 156
- Issue: 9
- ISSN: 0021-9606
- Page Range / eLocation ID: Article No. 094902
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Efficient sampling of the conformational space is essential for quantitative simulations of proteins. The multiscale enhanced sampling (MSES) method accelerates atomistic sampling by coupling it to a coarse‐grained (CG) simulation. Bias from coupling to the CG model is removed using Hamiltonian replica exchange, such that one could benefit simultaneously from the high accuracy of atomistic models and the fast dynamics of CG ones. Here, we extend MSES to allow independent control of the effective temperatures of atomistic and CG simulations, by directly scaling the atomistic and CG Hamiltonians. The new algorithm, named MSES with independent tempering (MSES‐IT), supports more sophisticated Hamiltonian and temperature replica exchange protocols to further improve the sampling efficiency. Using a small but nontrivial β‐hairpin, we show that setting the effective temperature of the CG model in all conditions to its melting temperature maximizes structural transition rates at the CG level and promotes more efficient replica exchange and diffusion in the condition space. As a result, MSES‐IT drives faster reversible transitions at the atomic level and leads to significant improvement in generating converged conformational ensembles compared to the original MSES scheme.
-
One essential goal of constructing coarse-grained molecular dynamics (CGMD) models is to accurately predict nonequilibrium processes beyond the atomistic scale. While a CG model can be constructed by projecting the full dynamics onto a set of resolved variables, the dynamics of the CG variables can recover the full dynamics only when the conditional distribution of the unresolved variables is close to the one associated with the particular projection operator. In particular, the model's applicability to various nonequilibrium processes is generally unwarranted due to the inconsistency in the conditional distribution. Here, we present a data-driven approach for constructing CGMD models that retain certain generalization ability for nonequilibrium processes. Unlike the conventional CG models based on preselected CG variables (e.g., the center of mass), the present CG model seeks a set of auxiliary CG variables similar to the time-lagged independent component analysis to maximize the velocity correlation. This effectively minimizes the entropy contribution of unresolved variables and ensures the distribution under a broad range of nonequilibrium conditions approaches the one under equilibrium. Numerical results of a polymer melt system demonstrate the significance of this broadly overlooked metric for the model's generalization ability, and the effectiveness of the present CG model for predicting the complex viscoelastic responses under various nonequilibrium flows.
-
The integral equation coarse-graining (IECG) approach is a promising high-level coarse-graining (CG) method for polymer melts, with variable resolution from soft spheres to multi CG sites, which preserves the structural and thermodynamical consistencies with the related atomistic simulations. When compared to the atomistic description, the procedure of coarse-graining results in smoother free energy surfaces, longer-ranged potentials, a decrease in the number of interaction sites for a given polymer, and more. Because these changes have competing effects on the computational efficiency of the CG model, care needs to be taken when studying the effect of coarse-graining on the computational speed-up in CG molecular dynamics simulations. For instance, treatment of long-range CG interactions requires the selection of cutoff distances that include the attractive part of the effective CG potential and force. In particular, we show how the complex nature of the range and curvature of the effective CG potential, the selection of a suitable CG timestep, the choice of the cutoff distance, the molecular dynamics algorithms, and the smoothness of the CG free energy surface affect the efficiency of IECG simulations. By direct comparison with the atomistic simulations of relatively short chain polymer melts, we find that the overall computational efficiency is highest for the highest level of CG (soft spheres), with an overall improvement of the computational efficiency being about 10^6–10^8 for various CG levels/resolutions. Therefore, the IECG method can have important applications in molecular dynamics simulations of polymeric systems. Finally, making use of the standard spatial decomposition algorithm, the parallel scalability of the IECG simulations for various levels of CG is presented. Optimal parallel scaling is observed for a reasonably large number of processors. Although this study is performed using the IECG approach, its results on the relation between the level of CG and the computational efficiency are general and apply to any properly-constructed CG model.
-
Coarse-grained molecular dynamics (CGMD) simulations address lengthscales and timescales that are critical to many chemical and material applications. Nevertheless, contemporary CGMD modeling is relatively bespoke, and there are no black-box CGMD methodologies available that could play a role in discovery applications comparable to the one that density functional theory plays for electronic structure. This gap might be filled by machine learning (ML)-based CGMD potentials that simplify model development, but these methods are still in their early stages and have yet to demonstrate a significant advantage over existing physics-based CGMD methods. Here, we explore the potential of Δ-learning models to leverage the advantages of these two approaches. This is implemented by using ML-based potentials to learn the difference between the target CGMD variable and the predictions of physics-based potentials. The Δ-models are benchmarked against the baseline models in reproducing on-target and off-target atomistic properties as a function of CG resolution, mapping operator, and system topology. The Δ-models outperform the reference ML-only CGMD models in nearly all scenarios. In several cases, the ML-only models manage to minimize training errors while still producing qualitatively incorrect dynamics, which the Δ-models correct. Given their negligible added cost, Δ-models provide essentially free gains over their ML-only counterparts. Nevertheless, an unexpected finding is that neither the Δ-learning models nor the ML-only models significantly outperform the elementary pairwise models in reproducing atomistic properties. This fundamental failure is attributed to the relatively large irreducible force errors associated with coarse-graining, which leave little benefit to be gained from more complex potentials.
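The Δ-learning construction described in this abstract (a learned correction added to a physics-based prediction) can be sketched in a few lines. This is only an illustration under stated assumptions: a one-dimensional harmonic bond force stands in for the physics-based potential, a polynomial least-squares fit stands in for the ML-based potential, and the "reference" forces are synthetic rather than data from the study. All function names and parameter values are hypothetical.

```python
import numpy as np

def baseline_force(r, k=10.0, r0=1.2):
    """Physics-based baseline (assumed): harmonic restoring force on a bond length r."""
    return -k * (r - r0)

def fit_delta(r, f_target, degree=5):
    """Least-squares polynomial fit to the residual f_target - baseline_force(r)."""
    residual = f_target - baseline_force(r)
    coeffs, *_ = np.linalg.lstsq(np.vander(r, degree + 1), residual, rcond=None)
    return coeffs

def delta_force(r, coeffs):
    """Delta-model prediction: baseline plus the learned correction."""
    return baseline_force(r) + np.vander(r, len(coeffs)) @ coeffs

def rmse(a, b):
    """Root-mean-square error between two force arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

rng = np.random.default_rng(1)

def reference_force(r):
    """Synthetic stand-in for mapped atomistic forces (not data from the study)."""
    noise = rng.normal(scale=0.05, size=r.shape)
    return -10.0 * (r - 1.2) + 0.5 * np.sin(4.0 * r) + noise

r_train = rng.uniform(0.8, 1.6, size=500)
r_test = rng.uniform(0.8, 1.6, size=500)
f_train, f_test = reference_force(r_train), reference_force(r_test)

coeffs = fit_delta(r_train, f_train)
print("baseline RMSE:", rmse(baseline_force(r_test), f_test))
print("delta RMSE:   ", rmse(delta_force(r_test, coeffs), f_test))
```

Because the correction only has to represent what the baseline misses, a small model can suffice. The remaining error in this toy example is set by the injected noise, loosely analogous to the irreducible force error the abstract mentions.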
