Title: Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential
Abstract Atomistic simulation has a broad range of applications from drug design to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive ab initio simulations. For this reason, chemistry and materials science would greatly benefit from a general reactive MLIP, that is, an MLIP that is applicable to a broad range of reactive chemistry without the need for refitting. Here we develop a general reactive MLIP (ANI-1xnr) through automated sampling of condensed-phase reactions. ANI-1xnr is then applied to study five distinct systems: carbon solid-phase nucleation, graphene ring formation from acetylene, biofuel additives, combustion of methane and the spontaneous formation of glycine from early Earth small molecules. In all studies, ANI-1xnr closely matches experiment (when available) and/or previous studies using traditional model chemistry methods. As such, ANI-1xnr proves to be a highly general reactive MLIP for C, H, N and O elements in the condensed phase, enabling high-throughput in silico reactive chemistry experimentation.
Award ID(s):
2102461
PAR ID:
10526207
Publisher / Repository:
Nature Chemistry
Journal Name:
Nature Chemistry
Volume:
16
Issue:
5
ISSN:
1755-4330
Page Range / eLocation ID:
727 to 734
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Lian, T (Ed.)
    The fast and accurate simulation of chemical reactions is a major goal of computational chemistry. Recently, the pursuit of this goal has been aided by machine learning interatomic potentials (MLIPs), which provide energies and forces at quantum mechanical accuracy but at a fraction of the cost of the reference quantum mechanical calculations. Assembling the training set of relevant configurations is key to building the MLIP. Here, we demonstrate two approaches to training reactive MLIPs based on reaction pathway information. One approach exploits reaction datasets containing reactant, product, and transition state structures. Using an SN2 reaction dataset, we accurately locate reaction pathways and transition state geometries of up to 170 unseen reactions. In another approach, which does not depend on data availability, we present an efficient active learning procedure that yields an accurate MLIP and converged minimum energy path given only the reaction end point structures, avoiding quantum mechanics driven reaction pathway search at any stage of training set construction. We demonstrate this procedure on an SN2 reaction in the gas phase and with a small number of solvating water molecules, predicting reaction barriers within 20 meV of the reference quantum chemistry method. We then apply the active learning procedure on a more complex reaction involving a nucleophilic aromatic substitution and proton transfer, comparing the results against the reactive ReaxFF force field. Our active learning procedure, in addition to rapidly finding reaction paths for individual reactions, provides an approach to building large reaction path databases for training transferable reactive machine learning potentials. 
  2. Abstract Aerosol chemistry has broad relevance for climate and global public health. The role of interfacial phenomena in condensed‐phase aerosol reactions remains poorly understood. In this work, liquid drop formalisms are coupled with high‐pressure transition state theory to formulate an expression for predicting the size‐dependence of aerosol reaction rates and viscosity. Insights from high‐pressure synthesis studies suggest that accretion and cyclization reactions are accelerated in 3–10‐nm particles. Reactions of peroxide, epoxide, furanoid, aldol, and carbonyl functional groups are accelerated by up to tenfold. Effective rate enhancements are ranked as: cycloadditions >> aldol reactions > epoxide reactions > Baeyer‐Villiger oxidation >> imidazole formation (which is inhibited). Some reactions are enabled by the elevated pressure in particles. Viscosity increases for organic liquids but decreases for viscous or solid particles. Results suggest that internal pressure is an important consideration in studies of the physics and chemical evolution of nanoparticles.
  3. Abstract Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models. In chemistry, ML has been used to develop models for predicting molecular properties, for example quantum mechanics (QM) calculated potential energy surfaces and atomic charge models. The ANI-1x and ANI-1ccx ML-based general-purpose potentials for organic molecules were developed through active learning, an automated data diversification process. Here, we describe the ANI-1x and ANI-1ccx data sets. To demonstrate data diversity, we visualize it with a dimensionality reduction scheme, and contrast against existing data sets. The ANI-1x data set contains multiple QM properties from 5 M density functional theory calculations, while the ANI-1ccx data set contains 500 k data points obtained with an accurate CCSD(T)/CBS extrapolation. Approximately 14 million CPU core-hours were expended to generate this data. Multiple QM calculated properties for the chemical elements C, H, N, and O are provided: energies, atomic forces, multipole moments, atomic charges, etc. We provide this data to the community to aid research and development of ML models for chemistry.
  4. Abstract Machine learning interatomic potentials (MLIPs) have been widely adopted for atomistic simulations. While errors and discrepancies for MLIPs have been reported, a comprehensive examination of the MLIPs’ performance over a broad spectrum of material properties has been lacking. This study introduces an analysis process comprising model sampling, benchmarking, error evaluations, and multi-dimensional statistical analyses on an ensemble of MLIPs for prediction errors over a diverse range of properties. By carrying out this analysis on 2300 MLIP models based on six different MLIP types, several properties that pose challenges for the MLIPs to achieve small errors are identified. The Pareto front analyses on two or more properties reveal the trade-offs in different properties of MLIPs, underscoring the difficulties of achieving low errors for a large number of properties simultaneously. Furthermore, we propose correlation graph analyses to characterize the error performances of MLIPs and to select the representative properties for predicting other property errors. This analysis process on a large dataset of MLIP models sheds light on the underlying complexities of MLIP performance, offering crucial guidance for the future development of MLIPs with improved predictive accuracy across an array of material properties.
  5. Abstract Nonequilibrium phase transitions play a pivotal role in broad physical contexts, from condensed matter to cosmology. Tracking the formation of nonequilibrium phases in condensed matter requires a resolution of the long-range cooperativity on ultra-short timescales. Here, we study the spontaneous transformation of a charge-density wave in CeTe3 from a stripe order into a bi-directional state that is inaccessible thermodynamically but is induced by intense laser pulses. With ≈100 fs resolution coherent electron diffraction, we capture the entire course of this transformation and show self-organization that defines a nonthermal critical point, unveiling the nonequilibrium energy landscape. We discuss the generation of instabilities by a swift interaction quench that changes the system symmetry preference, and the phase ordering dynamics orchestrated over a nonadiabatic timescale to allow new order parameter fluctuations to gain long-range correlations. Remarkably, the subsequent thermalization locks the remnants of the transient order into longer-lived topological defects for more than 2 ns.