
Title: Learning molecular potentials with neural networks

The potential energy of molecular species and their conformers can be computed with a wide range of computational chemistry methods, from molecular mechanics to ab initio quantum chemistry. However, choosing a computational approach that balances computational cost against the reliability of the calculated energies is a dilemma, especially for large molecules. This dilemma proves even more problematic for studies that require hundreds or thousands of calculations, such as drug discovery. Meanwhile, driven by their pattern recognition capabilities, neural networks have gained popularity in the computational chemistry community. During the last decade, many neural network potentials have been developed to predict a variety of chemical information for different systems. Neural network potentials have been shown to predict chemical properties with accuracy comparable to quantum mechanical approaches but at a cost approaching that of molecular mechanics calculations. As a result, the development of more reliable, transferable, and extensible neural network potentials has become an attractive field of study. In this review, we present an overview of the status of current neural network potentials and strategies to improve their accuracy. We provide recent examples of studies that demonstrate the applicability of these potentials. We also discuss the capabilities and shortcomings of the current models, as well as the challenges and future prospects of their development and applications. We expect that this review will provide guidance for the development of neural network potentials and the exploitation of their applicability.

This article is categorized under:

Data Science > Artificial Intelligence/Machine Learning

Molecular and Statistical Mechanics > Molecular Interactions

Software > Molecular Modeling

Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Journal Name: WIREs Computational Molecular Science
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Water/oxide interfaces are ubiquitous on earth and show significant influence on many chemical processes. For example, understanding water and solute adsorption as well as catalytic water splitting can help build better fuel cells and solar cells to overcome our looming energy crisis; the interaction between biomolecules and water/oxide interfaces is one hypothesis to explain the origin of life. However, knowledge in this area is still limited due to the difficulty of studying water/solid interfaces. As a result, research using increasingly sophisticated experimental techniques and computational simulations has been carried out in recent years. Although it is difficult for experimental techniques to provide detailed microscopic structural information, molecular dynamics (MD) simulations have satisfactory performance. In this review, we discuss classical and ab initio MD simulations of water/oxide interfaces. Generally, we are interested in the following questions: How do solid surfaces perturb interfacial water structure? How do interfacial water molecules and adsorbed solutes affect solid surfaces and how do interfacial environments affect solvent and solute behavior? Finally, we discuss progress in the application of neural network potential based MD simulations, which offer a promising future because this approach has already enabled ab initio level accuracy for very large systems and long trajectories.

    This article is categorized under:

    Theoretical and Physical Chemistry > Spectroscopy

    Molecular and Statistical Mechanics > Molecular Interactions

    Structure and Mechanism > Molecular Structures

  2. Abstract

    AQME, automated quantum mechanical environments, is a free and open‐source Python package for the rapid deployment of automated workflows using cheminformatics and quantum chemistry. AQME workflows integrate tasks performed across multiple computational chemistry packages and data formats, preserving all computational protocols, data, and metadata for machine and human users to access and reuse. AQME has a modular structure of independent modules that can be run in any sequence, allowing users to use all or only the desired parts of the program. The code has been developed for researchers with basic familiarity with the Python programming language. The CSEARCH module interfaces to molecular mechanics and semi‐empirical QM (SQM) conformer generation tools (e.g., RDKit and Conformer–Rotamer Ensemble Sampling Tool, CREST) starting from various initial structure formats. The CMIN module enables geometry refinement with SQM and neural network potentials, such as ANI. The QPREP module interfaces with multiple QM programs, such as Gaussian, ORCA, and PySCF. The QCORR module processes QM results, storing structural, energetic, and property data while also enabling automated error handling (e.g., convergence errors, wrong number of imaginary frequencies, and isomerization) and job resubmission. The QDESCP module provides easy access to QM ensemble‐averaged molecular descriptors and computed properties, such as NMR spectra. Overall, AQME provides automated, transparent, and reproducible workflows to produce, analyze, and archive computational chemistry results. SMILES inputs can be used, and many aspects of tedious human manipulation can be avoided. Installation and execution on Windows, macOS, and Linux platforms have been tested, and the code has been developed to support access through Jupyter Notebooks, the command line, and job submission (e.g., Slurm) scripts. Examples of pre‐configured workflows are available in various formats, and hands‐on video tutorials illustrate their use.

    This article is categorized under:

    Data Science > Chemoinformatics

    Data Science > Computer Algorithms and Programming

    Software > Quantum Chemistry

  3. Catalyzed by enormous success in the industrial sector, many research programs have been exploring data-driven, machine learning approaches. Performance can be poor when the model is extrapolated to new regions of chemical space, e.g., new bonding types or new many-body interactions. Another important limitation is the spatial locality assumption in model architecture, which cannot be overcome with larger or more diverse datasets. The outlined challenges are primarily associated with the lack of electronic structure information in surrogate models such as interatomic potentials. Given the fast development of machine learning and computational chemistry methods, we expect some limitations of surrogate models to be addressed in the near future; nevertheless, the spatial locality assumption will likely remain a limiting factor for their transferability. Here, we suggest focusing on an equally important effort: the design of physics-informed models that leverage domain knowledge and employ machine learning only as a corrective tool. In the context of materials science, we will focus on semi-empirical quantum mechanics, using machine learning to predict corrections to the reduced-order Hamiltonian model parameters. The resulting models are broadly applicable, retain the speed of semi-empirical chemistry, and frequently achieve accuracy on par with much more expensive ab initio calculations. These early results indicate that future work, in which machine learning and quantum chemistry methods are developed jointly, may provide the best of all worlds for chemistry applications that demand both high accuracy and high numerical efficiency.

  4. Abstract

    Quantum mechanics/molecular mechanics (QM/MM) simulations are a popular approach to study various features of large systems. A common application of QM/MM calculations is the investigation of reaction mechanisms in condensed‐phase and biological systems. The combination of QM and MM methods to represent a system gives rise to several challenges that need to be addressed. The increase in computational speed has allowed the expanded use of more complicated and accurate methods for both QM and MM simulations. Here, we review some approaches that address several common challenges encountered in QM/MM simulations with advanced polarizable potentials, from methods to account for the boundary across covalent bonds and long‐range effects, to polarization and advanced embedding potentials.

    This article is categorized under:

    Electronic Structure Theory > Combined QM/MM Methods

    Molecular and Statistical Mechanics > Molecular Interactions

    Software > Simulation Methods

  5. Abstract

    Computational modeling of chemical and biological systems at atomic resolution is a crucial tool in the chemist’s toolset. The use of computer simulations requires a balance between cost and accuracy: quantum-mechanical methods provide high accuracy but are computationally expensive and scale poorly to large systems, while classical force fields are cheap and scalable, but lack transferability to new systems. Machine learning can be used to achieve the best of both approaches. Here we train a general-purpose neural network potential (ANI-1ccx) that approaches CCSD(T)/CBS accuracy on benchmarks for reaction thermochemistry, isomerization, and drug-like molecular torsions. This is achieved by training a network on DFT data, then using transfer learning techniques to retrain on a dataset of gold standard QM calculations (CCSD(T)/CBS) that optimally spans chemical space. The resulting potential is broadly applicable to materials science, biology, and chemistry, and billions of times faster than CCSD(T)/CBS calculations.
