Title: Ultra-fast interpretable machine-learning potentials
All-atom dynamics simulations are an indispensable quantitative tool in physics, chemistry, and materials science, but large systems and long simulation times remain challenging due to the trade-off between computational efficiency and predictive accuracy. To address this challenge, we combine effective two- and three-body potentials in a cubic B-spline basis with regularized linear regression to obtain machine-learning potentials that are physically interpretable, sufficiently accurate for applications, as fast as the fastest traditional empirical potentials, and two to four orders of magnitude faster than state-of-the-art machine-learning potentials. For data from empirical potentials, we demonstrate the exact retrieval of the potential. For data from density functional theory, the predicted energies, forces, and derived properties, including phonon spectra, elastic constants, and melting points, closely match those of the reference method. The introduced potentials might contribute towards accurate all-atom dynamics simulations of large atomistic systems over long-time scales.
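The idea in the abstract, a pair potential expanded in cubic B-splines and fitted by regularized linear regression, can be illustrated with a small sketch. This is not the authors' implementation: it covers only the two-body term, uses plain ridge regularization, and fits total energies of random three-atom clusters generated from a Morse potential that stands in for an empirical reference. The cutoff, knot spacing, Morse parameters, sample size, and all function names are assumptions chosen for illustration.

```python
import math
import random

# Illustrative settings (assumptions, not values from the paper).
R_MIN, R_CUT, N_INTERVALS = 0.8, 3.0, 22
H = (R_CUT - R_MIN) / N_INTERVALS
N_BASIS = N_INTERVALS + 3  # uniform cubic B-splines overlapping [R_MIN, R_CUT)

def cubic_bspline(u):
    """Cardinal cubic B-spline with support 0 <= u < 4."""
    if u < 0.0 or u >= 4.0:
        return 0.0
    if u < 1.0:
        return u**3 / 6.0
    if u < 2.0:
        return (-3 * u**3 + 12 * u**2 - 12 * u + 4) / 6.0
    if u < 3.0:
        return (3 * u**3 - 24 * u**2 + 60 * u - 44) / 6.0
    return (4.0 - u) ** 3 / 6.0

def basis(r):
    """All basis-function values at distance r (zero outside [R_MIN, R_CUT))."""
    if not (R_MIN <= r < R_CUT):
        return [0.0] * N_BASIS
    u0 = (r - R_MIN) / H
    return [cubic_bspline(u0 - n + 3) for n in range(N_BASIS)]

def morse(r, depth=1.0, a=1.5, r0=1.2):
    """Reference pair potential standing in for an empirical potential."""
    x = math.exp(-a * (r - r0))
    return depth * (x * x - 2.0 * x)

def random_trimer(rng):
    """Pair distances of a random three-atom cluster, no pair closer than R_MIN."""
    while True:
        r12, r13 = rng.uniform(R_MIN, R_CUT), rng.uniform(R_MIN, R_CUT)
        theta = rng.uniform(0.0, math.pi)
        sq = r12**2 + r13**2 - 2.0 * r12 * r13 * math.cos(theta)
        r23 = math.sqrt(max(sq, 0.0))
        if r23 >= R_MIN:
            return [r12, r13, r23]

def descriptor(distances):
    """Sum basis values over pairs, so the energy is linear in the coefficients."""
    phi = [0.0] * N_BASIS
    for r in distances:
        for n, b in enumerate(basis(r)):
            phi[n] += b
    return phi

def reference_energy(distances):
    """Total energy from the reference potential, truncated at the same cutoff."""
    return sum(morse(r) for r in distances if R_MIN <= r < R_CUT)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ridge(features, energies, lam=1e-6):
    """Ridge regression: solve (X^T X + lam I) c = X^T y for the coefficients c."""
    n = len(features[0])
    gram = [[sum(f[i] * f[j] for f in features) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(f[i] * e for f, e in zip(features, energies)) for i in range(n)]
    return solve(gram, rhs)

def pair_potential(coeffs, r):
    """Fitted two-body potential: a linear combination of the basis functions."""
    return sum(c * b for c, b in zip(coeffs, basis(r)))

rng = random.Random(0)
configs = [random_trimer(rng) for _ in range(400)]
coeffs = fit_ridge([descriptor(d) for d in configs],
                   [reference_energy(d) for d in configs])
grid = [0.9 + 0.05 * k for k in range(41)]  # distances from 0.9 to 2.9
max_err = max(abs(pair_potential(coeffs, r) - morse(r)) for r in grid)
print(f"max |V_fit - V_ref| on [0.9, 2.9]: {max_err:.2e}")
```

Because the energy is linear in the spline coefficients, the fit reduces to one linear solve, and the fitted coefficients can be read directly as the shape of the pair potential. Each distance touches only four basis functions, which is what keeps evaluation cheap.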
Award ID(s):
2118718
PAR ID:
10471641
Author(s) / Creator(s):
; ;
Publisher / Repository:
Springer Nature
Date Published:
Journal Name:
npj Computational Materials
Volume:
9
Issue:
1
ISSN:
2057-3960
Page Range / eLocation ID:
162
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.
  2. Histone modifications play a crucial role in regulating chromatin architecture and gene expression. Here we develop a multiscale model for incorporating methylation in our nucleosome-resolution physics-based chromatin model to investigate the mechanisms by which H3K9 and H3K27 trimethylation (H3K9me3 and H3K27me3) influence chromatin structure and gene regulation. We apply three types of energy terms for this purpose: short-range potentials are derived from all-atom molecular dynamics simulations of wildtype and methylated chromatosomes, which revealed subtle local changes; medium-range potentials are derived by incorporating contacts between HP1 and nucleosomes modified by H3K9me3, to incorporate experimental results of enhanced contacts for short chromatin fibers (12 nucleosomes); for long-range interactions we identify H3K9me3- and H3K27me3-associated contacts based on Hi-C maps with a machine learning approach. These combined multiscale effects can model methylation as a first approximation in our mesoscale chromatin model, and applications to gene systems offer new insights into the epigenetic regulation of genomes mediated by H3K9me3 and H3K27me3. 
  3. We present PyXtal_FF—a package based on Python programming language—for developing machine learning potentials (MLPs). The aim of PyXtal_FF is to promote the application of atomistic simulations through providing several choices of atom-centered descriptors and machine learning regressions in one platform. Based on the given choice of descriptors (including the atom-centered symmetry functions, embedded atom density, SO4 bispectrum, and smooth SO3 power spectrum), PyXtal_FF can train MLPs with either generalized linear regression or neural network models, by simultaneously minimizing the errors of energy/forces/stress tensors in comparison with the data from ab-initio simulations. The trained MLP model from PyXtal_FF is interfaced with the Atomic Simulation Environment (ASE) package, which allows different types of light-weight simulations such as geometry optimization, molecular dynamics simulation, and physical properties prediction. Finally, we will illustrate the performance of PyXtal_FF by applying it to investigate several material systems, including the bulk SiO2, high entropy alloy NbMoTaW, and elemental Pt for general purposes. Full documentation of PyXtal_FF is available at https://pyxtal-ff.readthedocs.io.
  4. Machine learning interatomic potentials (MLIPs) are a promising technique for atomic modeling. While small errors are widely reported for MLIPs, an open concern is whether MLIPs can accurately reproduce atomistic dynamics and related physical properties in molecular dynamics (MD) simulations. In this study, we examine the state-of-the-art MLIPs and uncover several discrepancies related to atom dynamics, defects, and rare events (REs), compared to ab initio methods. We find that low averaged errors by current MLIP testing are insufficient, and develop quantitative metrics that better indicate the accurate prediction of atomic dynamics by MLIPs. The MLIPs optimized by the RE-based evaluation metrics are demonstrated to have improved prediction in multiple properties. The identified errors, the evaluation metrics, and the proposed process of developing such metrics are general to MLIPs, thus providing valuable guidance for future testing and improvements of accurate and reliable MLIPs for atomistic modeling.
  5. Fluctuations of protein three-dimensional structures and large-scale conformational transitions are crucial for the biological function of proteins and their complexes. Experimental studies of such phenomena remain very challenging and therefore molecular modeling can be a good alternative or a valuable supporting tool for the investigation of large molecular systems and long-time events. In this minireview, we present two alternative approaches to the coarse-grained (CG) modeling of dynamic properties of protein systems. We discuss two CG representations of polypeptide chains used for Monte Carlo dynamics simulations of protein local dynamics and conformational transitions, and highly simplified structure-based elastic network models of protein flexibility. In contrast to classical all-atom molecular dynamics, the modeling strategies discussed here allow the quite accurate modeling of much larger systems and longer-time dynamic phenomena. We briefly describe the main features of these models and outline some of their applications, including modeling of near-native structure fluctuations, sampling of large regions of the protein conformational space, or possible support for the structure prediction of large proteins and their complexes. 