

Search for: All records

Award ID contains: 2117429

Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the embargo period.

  1. Abstract

    The theorems of density functional theory (DFT) establish bijective maps between the local external potential of a many-body system and its electron density, wavefunction and, therefore, one-particle reduced density matrix. Building on this foundation, we show that machine learning models based on the one-electron reduced density matrix can be used to generate surrogate electronic structure methods. We generate surrogates of local and hybrid DFT, Hartree-Fock and full configuration interaction theories for systems ranging from small molecules such as water to more complex compounds like benzene and propanol. The surrogate models use the one-electron reduced density matrix as the central quantity to be learned. From the predicted density matrices, we show that either standard quantum chemistry or a second machine-learning model can be used to compute molecular observables, energies, and atomic forces. The surrogate models can generate essentially anything that a standard electronic structure method can, ranging from band gaps and Kohn-Sham orbitals to energy-conserving ab initio molecular dynamics simulations and infrared spectra, which account for anharmonicity and thermal effects, without the need to employ computationally expensive algorithms such as self-consistent field theory. The algorithms are packaged in an efficient and easy-to-use Python code, QMLearn, accessible on popular platforms.

     
    Free, publicly-accessible full text available December 1, 2024
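The abstract above reduces to a simple pipeline: learn a map from molecular geometry to the one-electron reduced density matrix, then derive observables from the predicted matrix. Below is a minimal, hypothetical sketch of that idea using kernel ridge regression on random placeholder data; it is not the QMLearn API, and the geometries, matrices, and hyperparameters are illustrative only.

```python
# Hypothetical sketch (not the QMLearn API): learn one-electron reduced
# density matrices (1-RDMs) from molecular geometries with kernel ridge
# regression, then recover a one-body energy as a trace with a fixed
# one-electron Hamiltonian.  All data below are random placeholders.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
n_train, n_coords, n_basis = 50, 9, 6        # e.g. water: 3 atoms x 3 coords, toy basis

X = rng.normal(size=(n_train, n_coords))               # geometries (placeholder data)
gammas = rng.normal(size=(n_train, n_basis, n_basis))
gammas = 0.5 * (gammas + gammas.transpose(0, 2, 1))    # 1-RDMs are symmetric

# One multi-output ridge model over the independent matrix elements (upper triangle).
iu = np.triu_indices(n_basis)
model = KernelRidge(kernel="rbf", alpha=1e-6, gamma=0.5)
model.fit(X, gammas[:, iu[0], iu[1]])

def predict_rdm(x):
    """Rebuild the full symmetric 1-RDM from the predicted upper triangle."""
    tri = model.predict(x[None, :])[0]
    g = np.zeros((n_basis, n_basis))
    g[iu] = tri
    return g + g.T - np.diag(np.diag(g))

h_core = rng.normal(size=(n_basis, n_basis))   # placeholder one-electron Hamiltonian
gamma_new = predict_rdm(rng.normal(size=n_coords))
e_one_body = np.trace(h_core @ gamma_new)      # one-body energy from the predicted 1-RDM
print(f"one-body energy from predicted 1-RDM: {e_one_body:.6f}")
```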
  2. For an electronic system, given a mean field method and a distribution of orbital occupation numbers that are close to the natural occupations of the correlated system, we provide formal evidence and computational support to the hypothesis that the entropy (or more precisely −σS, where σ is a parameter and S is the entropy) of such a distribution is a good approximation to the correlation energy. Underpinning the formal evidence are mild assumptions: the correlation energy is strictly a functional of the occupation numbers, and the occupation numbers derive from an invertible distribution. Computational support centers around employing different mean field methods and occupation number distributions (Fermi–Dirac, Gaussian, and linear), for which our claims are verified for a series of pilot calculations involving bond breaking and chemical reactions. This work establishes a formal footing for those methods employing entropy as a measure of electronic correlation energy (e.g., i-DMFT [Wang and Baerends, Phys. Rev. Lett. 128, 013001 (2022)] and TAO-DFT [J.-D. Chai, J. Chem. Phys. 136, 154104 (2012)]) and sets the stage for the widespread use of entropy functionals for approximating the (static) electronic correlation.

     
    Free, publicly-accessible full text available November 21, 2024
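As a rough illustration of the entropy-based estimate described above, the sketch below evaluates −σS for Fermi–Dirac occupation numbers; the orbital energies, temperature, and σ are placeholder values, not results from the paper.

```python
# Illustrative sketch of the entropy-as-correlation idea from the abstract
# above: given fractional occupation numbers n_i from a Fermi-Dirac
# distribution, estimate the correlation energy as -sigma * S, where
# S = -sum_i [n_i ln n_i + (1 - n_i) ln(1 - n_i)].  The orbital energies,
# temperature, and sigma below are placeholders, not values from the paper.
import numpy as np

def fermi_dirac(eps, mu, kT):
    """Fractional occupations for orbital energies eps at chemical potential mu."""
    return 1.0 / (1.0 + np.exp((eps - mu) / kT))

def occupation_entropy(n, tiny=1e-12):
    """Fermionic entropy of a set of occupation numbers in [0, 1]."""
    n = np.clip(n, tiny, 1.0 - tiny)
    return -np.sum(n * np.log(n) + (1.0 - n) * np.log(1.0 - n))

eps = np.linspace(-0.5, 0.5, 10)     # placeholder orbital energies (hartree)
n = fermi_dirac(eps, mu=0.0, kT=0.05)
sigma = 0.1                          # placeholder value of the sigma parameter

e_corr_estimate = -sigma * occupation_entropy(n)
print(f"entropy-based correlation estimate: {e_corr_estimate:.6f} hartree")
```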
  3. Free, publicly-accessible full text available November 8, 2024
  4. Free, publicly-accessible full text available June 22, 2024
  5. Obtaining solutions to optimal transportation (OT) problems is typically intractable when marginal spaces are continuous. Recent research has focused on approximating continuous solutions with discretization methods based on i.i.d. sampling, which has been shown to converge as the sample size increases. However, obtaining OT solutions with large sample sizes requires intensive computational effort, which can be prohibitive in practice. In this paper, we propose an algorithm for calculating discretizations with a given number of weighted points for marginal distributions by minimizing the (entropy-regularized) Wasserstein distance, and we provide bounds on the performance. The results suggest that our plans are comparable to those obtained with much larger numbers of i.i.d. samples and are more efficient than existing alternatives. Moreover, we propose a local, parallelizable version of such discretizations for applications, which we demonstrate by approximating adorable images.

     
    Free, publicly-accessible full text available June 1, 2024
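The sketch below shows only the entropy-regularized OT building block (plain Sinkhorn iterations) between a small weighted candidate discretization and an i.i.d. sample; the paper's point-placement algorithm and performance bounds are not reproduced, and the data and regularization strength are placeholders.

```python
# Minimal sketch of the entropy-regularized OT building block behind such
# discretizations: Sinkhorn iterations between a small weighted support
# (a candidate discretization) and an i.i.d. sample from the target marginal.
# Placeholder data and regularization; for small reg, log-domain
# stabilization would be needed in practice.
import numpy as np

def sinkhorn_cost(x, a, y, b, reg=1.0, n_iter=300):
    """Entropy-regularized OT cost <P, C> between weighted point clouds (x, a) and (y, b)."""
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)  # squared Euclidean cost
    K = np.exp(-C / reg)
    u = np.ones(len(a))
    v = np.ones(len(b))
    for _ in range(n_iter):                 # Sinkhorn fixed-point updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]         # transport plan
    return np.sum(P * C)

rng = np.random.default_rng(1)
y = rng.normal(size=(500, 2))               # i.i.d. sample from a continuous marginal
b = np.full(len(y), 1.0 / len(y))
x = rng.normal(size=(10, 2))                # candidate weighted discretization (10 points)
a = np.full(len(x), 1.0 / len(x))

print("entropy-regularized W2^2 estimate:", sinkhorn_cost(x, a, y, b))
```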
  6. Abstract
    State-of-the-art deep-learning systems use decision rules that are challenging for humans to model. Explainable AI (XAI) attempts to improve human understanding but rarely accounts for how people typically reason about unfamiliar agents. We propose explicitly modelling the human explainee via Bayesian teaching, which evaluates explanations by how much they shift explainees' inferences toward a desired goal. We assess Bayesian teaching in a binary image classification task across a variety of contexts. Absent intervention, participants predict that the AI's classifications will match their own, but explanations generated by Bayesian teaching improve their ability to predict the AI's judgements by moving them away from this prior belief. Bayesian teaching further allows each case to be broken down into sub-examples (here saliency maps). These sub-examples complement whole examples by improving error detection for familiar categories, whereas whole examples help predict correct AI judgements of unfamiliar cases.
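The following is a toy, hypothetical rendering of the Bayesian-teaching scoring idea described above: candidate explanations (subsets of examples) are ranked by how far they shift a simple model of the explainee's beliefs toward the AI's actual judgement. The explainee likelihoods and examples are invented for illustration and do not come from the paper.

```python
# Toy sketch of Bayesian teaching: score an explanation by how much it moves
# a model of the explainee's posterior toward the AI's actual judgement.
# The explainee model, likelihoods, and candidate explanations are
# hypothetical placeholders.
import numpy as np

labels = ("cat", "dog")

def explainee_posterior(prior, likelihoods, explanation):
    """Posterior over labels after the explainee sees the chosen examples."""
    post = np.array(prior, dtype=float)
    for example in explanation:
        post *= np.array([likelihoods[lab][example] for lab in labels])
    return post / post.sum()

def teaching_score(prior, likelihoods, explanation, ai_label):
    """How much the explanation moves the explainee toward the AI's judgement."""
    post = explainee_posterior(prior, likelihoods, explanation)
    return post[labels.index(ai_label)] - prior[labels.index(ai_label)]

# Hypothetical per-example likelihoods P(example | label) for the explainee.
likelihoods = {
    "cat": {"img1": 0.8, "img2": 0.3, "img3": 0.6},
    "dog": {"img1": 0.2, "img2": 0.7, "img3": 0.4},
}
prior = [0.5, 0.5]                      # explainee starts from an even prior
candidates = [("img1",), ("img2",), ("img1", "img3")]

best = max(candidates, key=lambda e: teaching_score(prior, likelihoods, e, "cat"))
print("best explanation for the AI's 'cat' judgement:", best)
```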
  7.
    It is desirable to combine the expressive power of deep learning with the Gaussian process (GP) in one expressive Bayesian learning model. Deep kernel learning showed success as a deep network used for feature extraction, with a GP then used as the function model. Recently, it was suggested that, despite training with the marginal likelihood, the deterministic nature of the feature extractor might lead to overfitting, and that replacing it with a Bayesian network seemed to cure the problem. Here, we propose the conditional deep Gaussian process (DGP), in which the intermediate GPs in the hierarchical composition are supported by hyperdata and the exposed GP remains zero-mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role of function supports, but they are hyperparameters rather than random variables. Following our previous moment-matching approach, the marginal prior of the conditional DGP is approximated by a GP carrying an effective kernel. Thus, as in empirical Bayes, the hyperdata are learned by optimizing the approximate marginal likelihood, which depends on the hyperdata implicitly through the kernel. We show the equivalence with deep kernel learning in the limit of dense hyperdata in latent space; however, the conditional DGP and the corresponding approximate inference enjoy the benefit of being more Bayesian than deep kernel learning. Preliminary extrapolation results demonstrate the expressive power gained from the depth of the hierarchy by exploiting the exact covariance and hyperdata learning, in comparison with GP kernel composition, DGP variational inference, and deep kernel learning. We also address the non-Gaussian aspects of our model as well as a way of upgrading to full Bayesian inference.
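For context, the sketch below implements only the deep kernel learning baseline that the abstract compares against (a feature map composed with an RBF kernel inside ordinary GP regression), not the proposed conditional DGP; the random feature map and toy data are placeholders, whereas in practice the extractor is a neural network trained through the marginal likelihood.

```python
# Minimal sketch of the deep kernel learning baseline referenced in the
# abstract (not the paper's conditional DGP): a feature map phi composed
# with an RBF kernel, used for standard GP regression on toy data.
import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=(1, 8)), rng.normal(size=8)

def phi(x):
    """Placeholder 'deep' feature extractor: one random tanh layer."""
    return np.tanh(x @ W + b)

def deep_rbf_kernel(x1, x2, lengthscale=1.0):
    """k(x, x') = exp(-||phi(x) - phi(x')||^2 / (2 l^2))."""
    f1, f2 = phi(x1), phi(x2)
    d2 = np.sum((f1[:, None, :] - f2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

# Toy regression: GP posterior mean at test points under the deep kernel.
x_train = np.linspace(-3, 3, 20)[:, None]
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.normal(size=20)
x_test = np.linspace(-4, 4, 5)[:, None]

K = deep_rbf_kernel(x_train, x_train) + 1e-2 * np.eye(20)   # noise jitter
K_star = deep_rbf_kernel(x_test, x_train)
mean = K_star @ np.linalg.solve(K, y_train)                 # GP posterior mean
print("posterior mean at test inputs:", np.round(mean, 3))
```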
  8. Pham, Tien ; Solomon, Latasha ; Hohil, Myron E. (Ed.)
  9.
  10.