This content will become publicly available on March 21, 2026

Title: Efficient exploration of reaction pathways using reaction databases and active learning
The fast and accurate simulation of chemical reactions is a major goal of computational chemistry. Recently, the pursuit of this goal has been aided by machine learning interatomic potentials (MLIPs), which provide energies and forces at quantum mechanical accuracy but at a fraction of the cost of the reference quantum mechanical calculations. Assembling the training set of relevant configurations is key to building the MLIP. Here, we demonstrate two approaches to training reactive MLIPs based on reaction pathway information. One approach exploits reaction datasets containing reactant, product, and transition state structures. Using an SN2 reaction dataset, we accurately locate reaction pathways and transition state geometries of up to 170 unseen reactions. In another approach, which does not depend on data availability, we present an efficient active learning procedure that yields an accurate MLIP and converged minimum energy path given only the reaction end point structures, avoiding quantum-mechanics-driven reaction pathway search at any stage of training set construction. We demonstrate this procedure on an SN2 reaction in the gas phase and with a small number of solvating water molecules, predicting reaction barriers within 20 meV of the reference quantum chemistry method. We then apply the active learning procedure to a more complex reaction involving a nucleophilic aromatic substitution and proton transfer, comparing the results against the reactive ReaxFF force field. Our active learning procedure, in addition to rapidly finding reaction paths for individual reactions, provides an approach to building large reaction path databases for training transferable reactive machine learning potentials.
Award ID(s):
2306042
PAR ID:
10628241
Editor(s):
Lian, T
Publisher / Repository:
AIP
Journal Name:
The Journal of Chemical Physics
Edition / Version:
1
Volume:
162
Issue:
11
ISSN:
0021-9606
Subject(s) / Keyword(s):
Quantum chemistry, Density functional theory, Potential energy surfaces, Machine learning, Interatomic potentials, Gas phase, Reaction mechanisms, SN2 reaction, Chemical reaction dynamics, Transition state
Format(s):
Medium: X Size: 2.7MB Other: PDF
Size(s):
2.7MB
Sponsoring Org:
National Science Foundation
More Like this
  1. Atomistic simulation has a broad range of applications from drug design to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive ab initio simulations. For this reason, chemistry and materials science would greatly benefit from a general reactive MLIP, that is, an MLIP that is applicable to a broad range of reactive chemistry without the need for refitting. Here we develop a general reactive MLIP (ANI-1xnr) through automated sampling of condensed-phase reactions. ANI-1xnr is then applied to study five distinct systems: carbon solid-phase nucleation, graphene ring formation from acetylene, biofuel additives, combustion of methane and the spontaneous formation of glycine from early earth small molecules. In all studies, ANI-1xnr closely matches experiment (when available) and/or previous studies using traditional model chemistry methods. As such, ANI-1xnr proves to be a highly general reactive MLIP for C, H, N and O elements in the condensed phase, enabling high-throughput in silico reactive chemistry experimentation.
  2. The rapid development and large body of literature on machine learning interatomic potentials (MLIPs) can make it difficult to know how to proceed for researchers who are not experts but wish to use these tools. The spirit of this review is to help such researchers by serving as a practical, accessible guide to the state-of-the-art in MLIPs. This review paper covers a broad range of topics related to MLIPs, including (i) central aspects of how and why MLIPs are enablers of many exciting advancements in molecular modeling, (ii) the main underpinnings of different types of MLIPs, including their basic structure and formalism, (iii) the potentially transformative impact of universal MLIPs for both organic and inorganic systems, including an overview of the most recent advances, capabilities, downsides, and potential applications of this nascent class of MLIPs, (iv) a practical guide for estimating and understanding the execution speed of MLIPs, including guidance for users based on hardware availability, type of MLIP used, and prospective simulation size and time, (v) a manual for what MLIP a user should choose for a given application by considering hardware resources, speed requirements, energy and force accuracy requirements, as well as guidance for choosing pre-trained potentials or fitting a new potential from scratch, (vi) discussion around MLIP infrastructure, including sources of training data, pre-trained potentials, and hardware resources for training, (vii) summary of some key limitations of present MLIPs and current approaches to mitigate such limitations, including methods of including long-range interactions, handling magnetic systems, and treatment of excited states, and finally (viii) we finish with some more speculative thoughts on what the future holds for the development and application of MLIPs over the next 3–10+ years. 
  3. Machine learning interatomic potentials (MLIPs) are a promising technique for atomic modeling. While small errors are widely reported for MLIPs, an open concern is whether MLIPs can accurately reproduce atomistic dynamics and related physical properties in molecular dynamics (MD) simulations. In this study, we examine the state-of-the-art MLIPs and uncover several discrepancies related to atom dynamics, defects, and rare events (REs), compared to ab initio methods. We find that low averaged errors by current MLIP testing are insufficient, and develop quantitative metrics that better indicate the accurate prediction of atomic dynamics by MLIPs. The MLIPs optimized by the RE-based evaluation metrics are demonstrated to have improved prediction in multiple properties. The identified errors, the evaluation metrics, and the proposed process of developing such metrics are general to MLIPs, thus providing valuable guidance for future testing and improvements of accurate and reliable MLIPs for atomistic modeling.
  4. Machine learning interatomic potentials (MLIPs) have been widely adopted for atomistic simulations. While errors and discrepancies for MLIPs have been reported, a comprehensive examination of the MLIPs’ performance over a broad spectrum of material properties has been lacking. This study introduces an analysis process comprising model sampling, benchmarking, error evaluations, and multi-dimensional statistical analyses on an ensemble of MLIPs for prediction errors over a diverse range of properties. By carrying out this analysis on 2300 MLIP models based on six different MLIP types, several properties that pose challenges for the MLIPs to achieve small errors are identified. The Pareto front analyses on two or more properties reveal the trade-offs in different properties of MLIPs, underscoring the difficulties of achieving low errors for a large number of properties simultaneously. Furthermore, we propose correlation graph analyses to characterize the error performances of MLIPs and to select the representative properties for predicting other property errors. This analysis process on a large dataset of MLIP models sheds light on the underlying complexities of MLIP performance, offering crucial guidance for the future development of MLIPs with improved predictive accuracy across an array of material properties.
  5. The calcium monofluoride (CaF) molecule has emerged as a promising candidate for precision measurements, quantum simulation, and ultracold chemistry experiments. Inelastic and reactive collisions of laser cooled CaF molecules in optical tweezers have recently been reported and collisions of cold Li atoms with CaF are of current experimental interest. In this paper, we report ab initio electronic structure and full-dimensional quantum dynamical calculations of the Li + CaF → LiF + Ca chemical reaction. The electronic structure calculations are performed using the internally contracted multi-reference configuration-interaction method with Davidson correction (MRCI + Q). An analytic fit of the interaction energies is obtained using a many-body expansion method. A coupled-channel quantum reactive scattering approach implemented in hyperspherical coordinates is adopted for the scattering calculations under cold conditions. Results show that the Li + CaF reaction populates several low-lying vibrational levels and many rotational levels of the product LiF molecule and that the reaction is inefficient in the 1–100 mK regime allowing sympathetic cooling of CaF by collisions with cold Li atoms. 