Abstract A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.
Ultra-fast interpretable machine-learning potentials
Abstract All-atom dynamics simulations are an indispensable quantitative tool in physics, chemistry, and materials science, but large systems and long simulation times remain challenging due to the trade-off between computational efficiency and predictive accuracy. To address this challenge, we combine effective two- and three-body potentials in a cubic B-spline basis with regularized linear regression to obtain machine-learning potentials that are physically interpretable, sufficiently accurate for applications, as fast as the fastest traditional empirical potentials, and two to four orders of magnitude faster than state-of-the-art machine-learning potentials. For data from empirical potentials, we demonstrate the exact retrieval of the potential. For data from density functional theory, the predicted energies, forces, and derived properties, including phonon spectra, elastic constants, and melting points, closely match those of the reference method. The introduced potentials might contribute towards accurate all-atom dynamics simulations of large atomistic systems over long time scales.
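As a rough illustration of the idea in the abstract above, the following sketch fits a two-body potential expanded in a uniform cubic B-spline basis using ridge (L2-regularized) linear regression. It is not the authors' implementation: the Lennard-Jones reference, the distance range, the number of spline intervals, the regularization strength, and all function names are assumptions chosen for illustration. Each synthetic "configuration" is simply a bag of six independent pair distances rather than a real atomic geometry, and three-body terms and force fitting are omitted.

```python
import numpy as np

def cubic_bspline(t):
    """Uniform cubic B-spline kernel, nonzero only for |t| < 2."""
    a = np.abs(t)
    inner = (4.0 - 6.0 * a**2 + 3.0 * a**3) / 6.0
    outer = (2.0 - a) ** 3 / 6.0
    return np.where(a < 1.0, inner, np.where(a < 2.0, outer, 0.0))

def basis(r, r_min, r_max, n_intervals):
    """Evaluate all basis functions at distances r.

    Returns an array of shape r.shape + (n_intervals + 3,).
    """
    h = (r_max - r_min) / n_intervals
    # One extra center on each side so the basis sums to 1 on [r_min, r_max].
    centers = r_min + h * np.arange(-1, n_intervals + 2)
    return cubic_bspline((np.asarray(r)[..., None] - centers) / h)

def lennard_jones(r):
    """Reference pair potential in reduced units (epsilon = sigma = 1)."""
    return 4.0 * (r**-12 - r**-6)

# Hypothetical settings, chosen only for this illustration.
R_MIN, R_MAX, N_INT = 1.0, 2.5, 60
rng = np.random.default_rng(0)

# Synthetic training set: 2000 "configurations", each a bag of 6 pair distances.
distances = rng.uniform(R_MIN, R_MAX, size=(2000, 6))
energies = lennard_jones(distances).sum(axis=1)

# The total energy is linear in the spline coefficients, so the descriptor of
# a configuration is the sum of basis-function values over its pairs.
X = basis(distances, R_MIN, R_MAX, N_INT).sum(axis=1)

# Ridge regression via the regularized normal equations.
lam = 1e-8
coeffs = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ energies)

def fitted_pair_potential(r):
    """Pair potential reconstructed from the fitted spline coefficients."""
    return basis(r, R_MIN, R_MAX, N_INT) @ coeffs

r_grid = np.linspace(R_MIN, R_MAX, 200)
max_err = np.max(np.abs(fitted_pair_potential(r_grid) - lennard_jones(r_grid)))
print(f"max |V_fit - V_LJ| on grid: {max_err:.2e}")
```

Because the model is linear in the coefficients, training reduces to a single linear solve, and the fitted coefficients can be read directly as the shape of the pair potential, which is one way to understand the interpretability and speed the abstract describes.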
- Award ID(s): 2118718
- PAR ID: 10453286
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: npj Computational Materials
- Volume: 9
- Issue: 1
- ISSN: 2057-3960
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract We present PyXtal_FF, a package based on the Python programming language, for developing machine learning potentials (MLPs). The aim of PyXtal_FF is to promote the application of atomistic simulations by providing several choices of atom-centered descriptors and machine learning regressions in one platform. Based on the given choice of descriptors (including the atom-centered symmetry functions, embedded atom density, SO4 bispectrum, and smooth SO3 power spectrum), PyXtal_FF can train MLPs with either generalized linear regression or neural network models, by simultaneously minimizing the errors of energy/forces/stress tensors in comparison with the data from ab initio simulations. The trained MLP model from PyXtal_FF is interfaced with the Atomic Simulation Environment (ASE) package, which allows different types of light-weight simulations such as geometry optimization, molecular dynamics simulation, and physical properties prediction. Finally, we illustrate the performance of PyXtal_FF by applying it to investigate several material systems, including the bulk SiO2, high entropy alloy NbMoTaW, and elemental Pt for general purposes. Full documentation of PyXtal_FF is available at https://pyxtal-ff.readthedocs.io.
-
Histone modifications play a crucial role in regulating chromatin architecture and gene expression. Here we develop a multiscale model for incorporating methylation in our nucleosome-resolution physics-based chromatin model to investigate the mechanisms by which H3K9 and H3K27 trimethylation (H3K9me3 and H3K27me3) influence chromatin structure and gene regulation. We apply three types of energy terms for this purpose: short-range potentials are derived from all-atom molecular dynamics simulations of wildtype and methylated chromatosomes, which revealed subtle local changes; medium-range potentials are derived by incorporating contacts between HP1 and nucleosomes modified by H3K9me3, to incorporate experimental results of enhanced contacts for short chromatin fibers (12 nucleosomes); for long-range interactions we identify H3K9me3- and H3K27me3-associated contacts based on Hi-C maps with a machine learning approach. These combined multiscale effects can model methylation as a first approximation in our mesoscale chromatin model, and applications to gene systems offer new insights into the epigenetic regulation of genomes mediated by H3K9me3 and H3K27me3.
-
Abstract Machine learning interatomic potentials (MLIPs) are a promising technique for atomic modeling. While small errors are widely reported for MLIPs, an open concern is whether MLIPs can accurately reproduce atomistic dynamics and related physical properties in molecular dynamics (MD) simulations. In this study, we examine the state-of-the-art MLIPs and uncover several discrepancies related to atom dynamics, defects, and rare events (REs), compared to ab initio methods. We find that low averaged errors by current MLIP testing are insufficient, and develop quantitative metrics that better indicate the accurate prediction of atomic dynamics by MLIPs. The MLIPs optimized by the RE-based evaluation metrics are demonstrated to have improved prediction in multiple properties. The identified errors, the evaluation metrics, and the proposed process of developing such metrics are general to MLIPs, thus providing valuable guidance for future testing and improvements of accurate and reliable MLIPs for atomistic modeling.
-
Abstract We present machine‐learning interatomic potentials (MLIPs) for simulations of Si–C–O–H compounds. The MLIPs are constructed from moment tensor potentials (MTPs) and were trained on a library of configurations that included polysiloxane structures, hypothetical crystalline and amorphous SiCOH structures, and trajectories of Si–C–O–H systems obtained via ab initio molecular dynamics (aiMD) simulations at elevated temperatures. Passive, active, and hybrid learning strategies were implemented to develop the MLIPs. The MLIPs reproduce vibrational properties of polymers and SiCOH structures obtained from aiMD simulations, thus providing a tool to identify chemical units and distinct structural characteristics through their vibrational properties. Simulations of the polymer‐to‐ceramic transformation show the development of mixed tetrahedra in SiCO ceramics and align with experimental observations. Million‐atom simulations for several nanoseconds highlight the precipitation of graphitic nanosheets from a carbon‐rich SiCO precursor. Atomistic simulations with the MLIPs deliver details of chemical reaction mechanisms during the pyrolysis of polysiloxanes, including methane abstraction and Kumada‐like rearrangements that transform the siloxane backbone. While the MLIPs still leave room for systematic improvement, they deliver simulations with “density functional theory (DFT)‐like” quality at low and high temperatures.
