Title: Machine learning prediction of accurate atomization energies of organic molecules from low-fidelity quantum chemical calculations
Recent studies illustrate how machine learning (ML) can be used to bypass a core challenge of molecular modeling: the trade-off between accuracy and computational cost. Here, we assess multiple ML approaches for predicting the atomization energy of organic molecules. Our resulting models learn the difference between low-fidelity (B3LYP) and high-accuracy (G4MP2) atomization energies and predict the G4MP2 atomization energy to 0.005 eV (mean absolute error) for molecules with fewer than nine heavy atoms (training set of 117,232 entries, test set of 13,026) and to 0.012 eV for a small set of 66 molecules with between 10 and 14 heavy atoms. Our two best models, which have different accuracy/speed trade-offs, enable the efficient prediction of G4MP2-level energies for large molecules and are available through a simple web interface.
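The idea of learning the difference between a low-fidelity and a high-accuracy energy can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the single descriptor is a hypothetical stand-in for a molecular representation, and the linear fit is a placeholder, since the abstract does not say which model types were used.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_synthetic(n, rng):
    """Hypothetical data set: one descriptor and two energy levels (eV) per molecule."""
    # Stand-in descriptor, e.g. a heavy-atom count (assumption, not from the source).
    descriptor = [rng.randint(1, 9) for _ in range(n)]
    # Synthetic "low-fidelity" energies.
    e_low = [-5.0 * d + rng.gauss(0.0, 1.0) for d in descriptor]
    # Synthetic "high-accuracy" energies: low-fidelity value plus a smooth correction.
    e_high = [lo + 0.03 * d + 0.1 + rng.gauss(0.0, 0.002)
              for lo, d in zip(e_low, descriptor)]
    return descriptor, e_low, e_high


def mean_abs_error(pred, ref):
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


rng = random.Random(0)
d_train, low_train, high_train = make_synthetic(200, rng)
d_test, low_test, high_test = make_synthetic(50, rng)

# Train on the difference (high - low), not on the high-accuracy energy itself.
delta_train = [hi - lo for hi, lo in zip(high_train, low_train)]
slope, intercept = fit_line(d_train, delta_train)

# Predict: low-fidelity energy plus the learned correction.
pred = [lo + slope * d + intercept for lo, d in zip(low_test, d_test)]

mae_low = mean_abs_error(low_test, high_test)   # error of the uncorrected energies
mae_delta = mean_abs_error(pred, high_test)     # error after the learned correction
print(f"MAE without correction: {mae_low:.4f} eV")
print(f"MAE with learned correction: {mae_delta:.4f} eV")
```

The design point of the sketch is that the model only has to capture the correction, which is typically smoother and smaller in magnitude than the total energy, so a simple model can recover most of the gap between the two levels of theory on this toy data.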
Award ID(s):
1636950
PAR ID:
10134744
Journal Name:
MRS Communications
Volume:
9
Issue:
3
ISSN:
2159-6859
Page Range / eLocation ID:
891 to 899
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A catalytic surface should be stable under reaction conditions to be effective. However, it takes significant effort to screen many surfaces for their stability, as this requires intensive quantum chemical calculations. To estimate stability more efficiently, we provide a general and data-efficient machine learning (ML) approach to accurately predict the surface energies of metal alloy surfaces. Our ML approach introduces an element-centered fingerprint (ECFP), which is used as a vector representation for fitting models that predict surface formation energies. The ECFP is significantly more accurate than several existing feature sets when applied to dilute alloy surfaces and is competitive with existing feature sets when applied to bulk alloy surfaces or gas-phase molecules. Models using the ECFP as input can be quite general, as we created models with good accuracy over a broad set of bimetallic surfaces including most d-block metals, even with relatively small datasets. For example, using the ECFP, we developed a kernel ridge regression ML model that predicts the surface energies of alloys of diverse metal combinations with a mean absolute error of 0.017 eV atom⁻¹. Combining this model with an existing model for predicting adsorption energies, we estimated segregation trends of 596 single-atom alloys (SAAs) with and without CO adsorbed on these surfaces. As a simple test of the approach, we identify specific cases where CO does not induce segregation in these SAAs.
  2. We applied the localized orbital scaling correction (LOSC) in the Bethe–Salpeter equation (BSE) to predict accurate excitation energies for molecules. LOSC systematically eliminates the delocalization error in the density functional approximation and is capable of approximating quasiparticle (QP) energies with accuracy similar to or better than the GW Green’s function approach and at much lower computational cost. The QP energies from LOSC, instead of the commonly used G₀W₀ and evGW, are directly used in BSE. We show that the BSE/LOSC approach greatly outperforms the commonly used BSE/G₀W₀ approach for predicting excitations with different characters. For calculations on the Truhlar–Gagliardi test set, which contains valence, charge transfer, and Rydberg excitations, BSE/LOSC with the Tamm–Dancoff approximation provides accuracy comparable to time-dependent density functional theory (TDDFT) and BSE/evGW. For calculations on the Stein CT test set and Rydberg excitations of atoms, BSE/LOSC considerably outperforms both the BSE/G₀W₀ and TDDFT approaches with a reduced starting point dependence. BSE/LOSC is thus a promising and efficient approach to calculating excitation energies for molecular systems.
  3. The earlier integration of validated Lennard–Jones (LJ) potentials for 8 fcc metals into materials and biomolecular force fields has advanced multiple research fields, for example, metal–electrolyte interfaces, recognition of biomolecules, colloidal assembly of metal nanostructures, alloys, and catalysis. Here we introduce 12-6 and 9-6 LJ parameters for classical all-atom simulations of 10 further fcc metals (Ac, Ca (α), Ce (γ), Es (β), Fe (γ), Ir, Rh, Sr (α), Th (α), Yb (β)) and stainless steel. The parameters reproduce lattice constants, surface energies, water interfacial energies, and interactions with (bio)organic molecules in 0.1 to 5% agreement with experiment, as well as qualitative mechanical properties under standard conditions. Deviations are reduced by up to a factor of one hundred in comparison to earlier Lennard–Jones parameters, embedded atom models, and density functional theory. We also explain a quantitative correlation between atomization energies from experiments and surface energies that supports parameter development. The models are computationally very efficient and applicable to an exponential space of alloys. Compatibility with a wide range of force fields such as the Interface force field (IFF), AMBER, CHARMM, COMPASS, CVFF, DREIDING, OPLS-AA, and PCFF enables reliable simulations of nanostructures up to millions of atoms and microsecond time scales. User-friendly model building and input generation are available in the CHARMM-GUI Nanomaterial Modeler. As a limitation, deviations in mechanical properties vary and are comparable to DFT methods. We discuss the incorporation of reactivity and features of the electronic structure to expand the range of applications and further increase the accuracy.
  4. Brønsted‐Evans‐Polanyi (BEP) relationships, i.e., linear scaling between reaction and activation energies, lie at the core of the computational design of heterogeneous catalysts. However, BEPs are not general and often require reparameterization for each class of reactions. Here we construct generalized BEPs (gBEPs), which can predict activation energies for a diverse dataset of reactions of C-, O-, N-, and H-containing molecules on metal surfaces. In a first step, we develop a set of descriptors based on scaling relationships that can capture the change in chemical identity of reactants during the reaction. Subsequently, we use the reaction energy, these descriptors, and a single descriptor for the surface structure to parameterize machine-learning-based regression approaches for the prediction of activation energies. The best approach we developed shows a mean absolute error (MAE) of 0.11 eV for the training set (80% of the data set) and 0.23 eV for the test set (20% of the data set). The methodology presented here allows activation energies to be calculated within fractions of a second on a typical personal computer, and due to its generality, accuracy, and simplicity in application, it might prove useful in transition metal catalyst design.
  5. Sn clusters have been grown on highly oriented pyrolytic graphite (HOPG) surfaces and investigated by scanning tunneling microscopy (STM), X-ray photoelectron spectroscopy (XPS), and density functional theory (DFT) calculations. At low Sn coverages ranging from 0.02 to 0.25 monolayers (ML), Sn grows as small clusters that nucleate uniformly on the terraces. This behavior is in contrast with the growth of transition metals such as Pd, Pt, and Re on HOPG, given that these metals form large clusters with preferential nucleation for Pd and Pt at the favored low-coordination step edges. XPS experiments show no evidence of Sn-HOPG interactions, and the activation energy barrier for diffusion calculated for Sn on HOPG (0.06 eV) is lower than or comparable to those of Pd, Pt, and Re (0.04, 0.22, and 0.61 eV, respectively), indicating that the growth of the Sn clusters is not kinetically limited by diffusion on the surface. DFT calculations of the binding energy per atom as a function of cluster size demonstrate that the energies of the Sn clusters on HOPG are similar to that of Sn atoms in the bulk for Sn clusters larger than 10 atoms, whereas the Pt, Pd, and Re clusters on HOPG have energies that are 1-2 eV higher than in the bulk. Thus, there is no thermodynamic driving force for Sn atoms to form clusters larger than 10 atoms on HOPG, unlike for Pd, Pt, and Re atoms, which minimize their energy by aggregating into larger, more bulk-like clusters. In addition, annealing the Sn/HOPG clusters to 800 K and 950 K does not increase the cluster size but instead removes the larger clusters, while Sn deposition at 810 K induces the appearance of protrusions that are believed to be from subsurface Sn. DFT studies indicate that it is energetically favorable for a Sn atom to exist in the subsurface layer only when the Sn atom is located at a subsurface vacancy.