This content will become publicly available on December 1, 2024

Title: Machine learning electronic structure methods based on the one-electron reduced density matrix
Abstract

The theorems of density functional theory (DFT) establish bijective maps between the local external potential of a many-body system and its electron density, wavefunction and, therefore, one-particle reduced density matrix. Building on this foundation, we show that machine learning models based on the one-electron reduced density matrix can be used to generate surrogate electronic structure methods. We generate surrogates of local and hybrid DFT, Hartree-Fock, and full configuration interaction theories for systems ranging from small molecules such as water to more complex compounds such as benzene and propanol. The surrogate models use the one-electron reduced density matrix as the central quantity to be learned. From the predicted density matrices, we show that either standard quantum chemistry or a second machine-learning model can be used to compute molecular observables, energies, and atomic forces. The surrogate models can generate essentially anything that a standard electronic structure method can, ranging from band gaps and Kohn-Sham orbitals to energy-conserving ab initio molecular dynamics simulations and infrared spectra that account for anharmonicity and thermal effects, without the need to employ computationally expensive algorithms such as self-consistent field theory. The algorithms are packaged in an efficient and easy-to-use Python code, QMLearn, accessible on popular platforms.

 
Award ID(s):
2117429
NSF-PAR ID:
10471660
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Nature Communications
Volume:
14
Issue:
1
ISSN:
2041-1723
Page Range / eLocation ID:
1-9
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Supervised machine learning approaches have been increasingly used to accelerate electronic structure prediction as surrogates of first-principles computational methods, such as density functional theory (DFT). While numerous quantum chemistry datasets focus on chemical properties and atomic forces, accurate and efficient prediction of the Hamiltonian matrix is highly desirable, as it is the most important and fundamental physical quantity that determines the quantum states of physical systems and their chemical properties. In this work, we generate a new Quantum Hamiltonian dataset, named QH9, to provide precise Hamiltonian matrices for 2,399 molecular dynamics trajectories and 130,831 stable molecular geometries, based on the QM9 dataset. By designing benchmark tasks with various molecules, we show that current machine learning models have the capacity to predict Hamiltonian matrices for arbitrary molecules. Both the QH9 dataset and the baseline models are provided to the community through an open-source benchmark, which can be highly valuable for developing machine learning methods and accelerating molecular and materials design for scientific and technological applications. Our benchmark is publicly available at https://github.com/divelab/AIRS/tree/main/OpenDFT/QHBench.
  2. Supervised machine learning approaches have been increasingly used to accelerate electronic structure prediction as surrogates of first-principles computational methods, such as density functional theory (DFT). While numerous quantum chemistry datasets focus on chemical properties and atomic forces, accurate and efficient prediction of the Hamiltonian matrix is highly desirable, as it is the most important and fundamental physical quantity that determines the quantum states of physical systems and their chemical properties. In this work, we generate a new Quantum Hamiltonian dataset, named QH9, to provide precise Hamiltonian matrices for 2,399 molecular dynamics trajectories and 130,831 stable molecular geometries, based on the QM9 dataset. By designing benchmark tasks with various molecules, we show that current machine learning models have the capacity to predict Hamiltonian matrices for arbitrary molecules. Both the QH9 dataset and the baseline models are provided to the community through an open-source benchmark, which can be highly valuable for developing machine learning methods and accelerating molecular and materials design for scientific and technological applications.
  3. Accelerating the development of π-conjugated molecules for applications such as energy generation and storage, catalysis, sensing, pharmaceuticals, and (semi)conducting technologies requires rapid and accurate evaluation of their electronic, redox, or optical properties. While high-throughput computational screening has proven to be a tremendous aid in this regard, machine learning (ML) and other data-driven methods can further enable orders-of-magnitude reductions in time while dramatically expanding the chemical space that is explored. However, the lack of benchmark datasets containing the electronic, redox, and optical properties that characterize the diverse, known chemical space of organic π-conjugated molecules limits ML model development. Here, we present a curated dataset containing 25k molecules with properties evaluated by density functional theory (DFT) and time-dependent DFT (TDDFT), including frontier molecular orbitals, ionization energies, relaxation energies, and low-lying optical excitation energies. Using the dataset, we train a hierarchy of ML models, ranging from classical models such as ridge regression to sophisticated graph neural networks, with molecular SMILES representation as input. We observe that graph neural networks augmented with contextual information allow for significantly better predictions across a wide array of properties. Our best-performing models also provide an uncertainty quantification for the predictions. To democratize access to the data and trained models, an interactive web platform has been developed and deployed.
  4. Abstract

    In the modeling of spin‐crossing reactions, it has become popular to directly explore the spin‐adiabatic surfaces. Specifically, by constructing spin‐adiabatic states from a two‐state Hamiltonian (with spin‐orbit coupling matrix elements) at each geometry, one can readily employ advanced geometry optimization algorithms to acquire a “transition state” structure, where the spin crossing occurs. In this work, we report the implementation of a fully‐variational spin‐adiabatic approach based on Kohn‐Sham density functional theory spin states (sharing the same set of molecular orbitals) and the Breit‐Pauli one‐electron spin‐orbit operator. For three model spin‐crossing reactions (predissociation of N₂O, singlet‐triplet conversion in CH₂, and CO addition to Fe(CO)₄), the spin‐crossing points were obtained. Our results also indicated that the Breit‐Pauli one‐electron spin‐orbit coupling can vary significantly along the reaction pathway on the spin‐adiabatic energy surface. On the other hand, due to the restriction that low‐spin and high‐spin states share the same set of molecular orbitals, the acquired spin‐adiabatic energy surface shows a cusp (i.e., a first‐order discontinuity) at the crossing point, which prevents the use of standard geometry optimization algorithms to pinpoint the crossing point. An extension with this restriction removed is being developed to achieve smooth spin‐adiabatic surfaces.

     
  5. Abstract

    The Hohenberg-Kohn theorem of density-functional theory establishes the existence of a bijection between the ground-state electron density and the external potential of a many-body system. This guarantees a one-to-one map from the electron density to all observables of interest including electronic excited-state energies. Time-Dependent Density-Functional Theory (TDDFT) provides one framework to resolve this map; however, the approximations inherent in practical TDDFT calculations, together with their computational expense, motivate finding a cheaper, more direct map for electronic excitations. Here, we show that determining density and energy functionals via machine learning allows the equations of TDDFT to be bypassed. The framework we introduce is used to perform the first excited-state molecular dynamics simulations with a machine-learned functional on malonaldehyde and correctly capture the kinetics of its excited-state intramolecular proton transfer, allowing insight into how mechanical constraints can be used to control the proton transfer reaction in this molecule. This development opens the door to using machine-learned functionals for highly efficient excited-state dynamics simulations.

     