Machine learning interatomic potentials (IPs) can provide accuracy close to that of first-principles methods, such as density functional theory (DFT), at a fraction of the computational cost. This greatly extends the scope of accurate molecular simulations, providing opportunities for quantitative design of materials and devices on scales hitherto unreachable by DFT methods. However, machine learning IPs have a basic limitation in that they lack a physical model for the phenomena being predicted and therefore have unknown accuracy when extrapolating outside their training set. In this paper, we propose a class of Dropout Uncertainty Neural Network (DUNN) potentials that provide rigorous uncertainty estimates that can be understood from both Bayesian and frequentist statistics perspectives. As an example, we develop a DUNN potential for carbon and show how it can be used to predict uncertainty for static and dynamical properties, including stress and phonon dispersion in graphene. We demonstrate two approaches to propagate uncertainty in the potential energy and atomic forces to predicted properties. In addition, we show that DUNN uncertainty estimates can be used to detect configurations outside the training set, and in some cases, can serve as a predictor for the accuracy of a calculation.
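The dropout-based uncertainty idea can be sketched in a few lines: keep dropout active at inference time and treat the spread over repeated stochastic forward passes as the uncertainty estimate. The toy network, random weights, and dropout rate below are illustrative assumptions, not the actual DUNN architecture or descriptors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fully connected network with one hidden layer. The weights are
# random placeholders -- a trained potential would supply real ones.
W1 = rng.normal(size=(8, 3))     # hidden x descriptor
b1 = np.zeros(8)
W2 = rng.normal(size=(1, 8))     # energy readout
P_DROP = 0.2                     # dropout rate (assumed value)

def stochastic_energy(x):
    """One forward pass with dropout kept ACTIVE at inference time."""
    h = np.tanh(W1 @ x + b1)
    mask = rng.random(h.shape) >= P_DROP      # Bernoulli dropout mask
    h = h * mask / (1.0 - P_DROP)             # inverted-dropout rescaling
    return (W2 @ h).item()

# Monte Carlo estimate: the mean over stochastic passes is the predicted
# energy; the standard deviation is the uncertainty estimate.
x = np.array([0.1, -0.3, 0.5])               # toy descriptor vector
samples = np.array([stochastic_energy(x) for _ in range(200)])
energy, sigma = samples.mean(), samples.std()
```

For a configuration far from anything like the training data, the dropout masks disagree more and sigma grows, which is what makes this usable as an out-of-distribution detector.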
- Award ID(s): 1945525
- PAR ID: 10503668
- Publisher / Repository: Digital Discovery
- Date Published:
- Journal Name: Digital Discovery
- Volume: 2
- Issue: 4
- ISSN: 2635-098X
- Page Range / eLocation ID: 1058 to 1069
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Machine-learning potentials are accelerating the development of energy materials, especially in identifying phase diagrams and other thermodynamic properties. In this work, we present a neural network potential based on atom-centered symmetry function descriptors to model the energetics of lithium intercalation into graphite. The potential was trained on a dataset of over 9000 diverse lithium–graphite configurations that varied in applied stress and strain, lithium concentration, lithium–carbon and lithium–lithium bond distances, and stacking order to ensure wide sampling of the potential atomic configurations during intercalation. We calculated the energies of these structures using density functional theory (DFT) with the Bayesian error estimation functional with van der Waals correlation (BEEF-vdW) exchange-correlation functional, which can accurately describe the van der Waals interactions that are crucial to determining the thermodynamics of this phase space. Bayesian optimization, as implemented in Dragonfly, was used to select an optimal set of symmetry function parameters, ultimately resulting in a potential with a prediction error of 8.24 meV atom⁻¹ on unseen test data. The potential can predict energies, structural properties, and elastic constants at an accuracy comparable to other DFT exchange-correlation functionals at a fraction of the computational cost. The accuracy of the potential is also comparable to similar machine-learned potentials describing other systems. We calculate the open circuit voltage with the calculator and find good agreement with experiment, especially in the regime x ≥ 0.3, for x in LixC6. This study further illustrates the power of machine learning potentials, which promise to revolutionize the design and optimization of battery materials.
Neural network potentials (NNPs) trained against density functional theory (DFT) are capable of reproducing the potential energy surface at a fraction of the computational cost. However, most NNP implementations focus on energy and forces. In this work, we modified the NNP model introduced by Behler and Parrinello to predict Fermi energy, band edges, and partial density of states of Cu2O. Our NNP can reproduce the DFT potential energy surface and properties at a fraction of the computational cost. We used our NNP to perform molecular dynamics (MD) simulations and validated the predicted properties against DFT calculations. Our model achieved a root mean squared error of 16 meV for the energy prediction. Furthermore, we show that the standard deviation of the energies predicted by the ensemble of training snapshots can be used to estimate the uncertainty in the predictions. This allows us to switch from the NNP to DFT on-the-fly during the MD simulation to evaluate the forces when the uncertainty is high.
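The ensemble-disagreement fallback described above can be sketched generically: query an ensemble of cheap surrogates, and when their standard deviation exceeds a tolerance, fall back to the reference method. The 1D "potential", polynomial-fit ensemble, and threshold below are all illustrative assumptions, not the NNP or DFT setup of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the expensive reference method (e.g. DFT):
# a simple 1D double-well energy curve.
def reference_energy(x):
    return x**4 - x**2

# Stand-in for an "ensemble of training snapshots": polynomial fits to
# slightly different noisy samples of the reference data.
xs = np.linspace(-1.5, 1.5, 40)
ensemble = [
    np.polynomial.Polynomial.fit(
        xs, reference_energy(xs) + rng.normal(0.0, 0.02, xs.size), deg=6
    )
    for _ in range(8)
]

def predict_with_uncertainty(x):
    preds = np.array([m(x) for m in ensemble])
    return preds.mean(), preds.std()   # mean = prediction, std = uncertainty

THRESHOLD = 0.05  # assumed uncertainty tolerance, in the toy energy units

def energy(x):
    """Use the cheap ensemble; fall back to the reference when unsure."""
    mean, sigma = predict_with_uncertainty(x)
    if sigma > THRESHOLD:
        return reference_energy(x)     # on-the-fly fallback
    return mean
```

Inside the fitted region the ensemble members agree and the cheap prediction is used; extrapolating well outside it (e.g. x = 3) the members diverge and the fallback fires.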
-
Machine learning (ML) offers an attractive method for making predictions about molecular systems while circumventing the need to run expensive electronic structure calculations. Once trained on ab initio data, the promise of ML is to deliver accurate predictions of molecular properties that were previously computationally infeasible. In this work, we develop and train a graph neural network model to correct the basis set incompleteness error (BSIE) between a small and large basis set at the RHF and B3LYP levels of theory. Our results show that, when compared to fitting to the total potential, an ML model fitted to correct the BSIE is better at generalizing to systems not seen during training. We test this ability by training on single molecules while evaluating on molecular complexes. We also show that ensemble models yield better behaved potentials in situations where the training data is insufficient. However, even when only fitting to the BSIE, acceptable performance is only achieved when the training data sufficiently resemble the systems one wants to make predictions on. The test error of the final model trained to predict the difference between the cc-pVDZ and cc-pV5Z potential is 0.184 kcal/mol for the B3LYP density functional, and the ensemble model accurately reproduces the large basis set interaction energy curves on the S66x8 dataset.
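The core Δ-learning idea here, fitting the small-to-large basis correction rather than the total energy, can be sketched with toy functions and a plain least-squares model in place of the paper's graph neural network; the feature vectors and energy functions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-ins for small- and large-basis-set energies of a system
# described by a 2D feature vector x (purely illustrative functions).
def e_small(x):                        # e.g. a cc-pVDZ-level energy
    return x @ np.array([1.0, -0.5]) + 0.3 * x[0] * x[1]

def e_large(x):                        # e.g. a cc-pV5Z-level energy
    return e_small(x) + 0.1 * x[0] - 0.05 * x[1]   # smooth BSIE here

X = rng.normal(size=(50, 2))
# Train on the DIFFERENCE (the BSIE), not on the total potential.
y = np.array([e_large(x) - e_small(x) for x in X])

# Plain least-squares stands in for the graph neural network.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def e_corrected(x):
    """Cheap small-basis energy plus the learned BSIE correction."""
    return e_small(x) + x @ w
```

Because the correction is a smoother function of the features than the total energy, a model fitted to it tends to extrapolate better, which is the generalization effect the abstract reports.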
-
Computational modeling of chemical and biological systems at atomic resolution is a crucial tool in the chemist's toolset. The use of computer simulations requires a balance between cost and accuracy: quantum-mechanical methods provide high accuracy but are computationally expensive and scale poorly to large systems, while classical force fields are cheap and scalable, but lack transferability to new systems. Machine learning can be used to achieve the best of both approaches. Here we train a general-purpose neural network potential (ANI-1ccx) that approaches CCSD(T)/CBS accuracy on benchmarks for reaction thermochemistry, isomerization, and drug-like molecular torsions. This is achieved by training a network on DFT data, then using transfer learning techniques to retrain on a dataset of gold-standard QM calculations (CCSD(T)/CBS) that optimally spans chemical space. The resulting potential is broadly applicable to materials science, biology, and chemistry, and billions of times faster than CCSD(T)/CBS calculations.
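The two-stage recipe, pre-train on plentiful cheaper labels and then fine-tune on scarce gold-standard labels, can be sketched with a toy linear readout over fixed random features. The target functions, feature map, and regularization-toward-pretrained-weights update are all illustrative assumptions, not the ANI-1ccx training procedure.

```python
import numpy as np

rng = np.random.default_rng(3)

def gold_label(x):       # scarce "CCSD(T)-quality" labels (toy function)
    return np.sin(x)

def cheap_label(x):      # plentiful but biased "DFT" labels (toy function)
    return np.sin(x) + 0.1 * x

# Shared feature map: fixed random Fourier features (an assumption).
W = rng.normal(size=16)
B = rng.uniform(0.0, 2.0 * np.pi, size=16)
def features(x):
    return np.cos(np.outer(np.atleast_1d(x), W) + B)

# Stage 1: pre-train the readout weights on lots of cheap data.
x_cheap = rng.uniform(-3.0, 3.0, 500)
w_pre, *_ = np.linalg.lstsq(features(x_cheap), cheap_label(x_cheap), rcond=None)

# Stage 2: fine-tune the same readout on a few gold-standard points,
# with a ridge penalty pulling the solution toward the pre-trained weights.
x_gold = rng.uniform(-3.0, 3.0, 20)
Phi = features(x_gold)
lam = 1e-2               # strength of the pull toward w_pre (assumed)
w_ft = np.linalg.solve(Phi.T @ Phi + lam * np.eye(16),
                       Phi.T @ gold_label(x_gold) + lam * w_pre)

def predict(x):
    return features(x) @ w_ft
```

The pre-trained weights supply a good starting point, so the fine-tuned model needs far fewer expensive labels than training from scratch would.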