skip to main content

Title: Kohn-Sham equations as regularizer: building prior knowledge into machine-learned physics
Including prior knowledge is important for effective machine learning models in physics, and is usually achieved by explicitly adding loss terms or constraints on model architectures. Prior knowledge embedded in the physics computation itself rarely draws attention. We show that solving the Kohn-Sham equations when training neural networks for the exchange-correlation functional provides an implicit regularization that greatly improves generalization. Two separations suffice for learning the entire one-dimensional H2 dissociation curve within chemical accuracy, including the strongly correlated region. Our models also generalize to unseen types of molecules and overcome self-interaction error.
Authors:
; ; ; ; ; ;
Award ID(s):
1856165
Publication Date:
NSF-PAR ID:
10220447
Journal Name:
Physical review letters
Issue:
126
ISSN:
1079-7114
Sponsoring Org:
National Science Foundation
More Like this
  1. Firoozi, R. ; Mehr, N. ; Yel, E. ; Antonova, R. ; Bohg, J. ; Schwager, M. ; Kochenderfer, M. (Ed.)
    Effective inclusion of physics-based knowledge into deep neural network models of dynamical sys- tems can greatly improve data efficiency and generalization. Such a priori knowledge might arise from physical principles (e.g., conservation laws) or from the system’s design (e.g., the Jacobian matrix of a robot), even if large portions of the system dynamics remain unknown. We develop a framework to learn dynamics models from trajectory data while incorporating a priori system knowledge as inductive bias. More specifically, the proposed framework uses physics-based side information to inform the structure of the neural network itself, and to place constraints on the values of the outputs and the internal states of the model. It represents the system’s vector field as a composition of known and unknown functions, the latter of which are parametrized by neural networks. The physics-informed constraints are enforced via the augmented Lagrangian method during the model’s training. We experimentally demonstrate the benefits of the proposed approach on a variety of dynamical systems – including a benchmark suite of robotics environments featur- ing large state spaces, non-linear dynamics, external forces, contact forces, and control inputs. By exploiting a priori system knowledge during training, the proposed approach learns to predict the systemmore »dynamics two orders of magnitude more accurately than a baseline approach that does not include prior knowledge, given the same training dataset.« less
  2. Bayesian Knowledge Tracing (BKT) is a commonly used approach for student modeling, and Long Short Term Memory (LSTM) is a versatile model that can be applied to a wide range of tasks, such as language translation. In this work, we directly compared three models: BKT, its variant Intervention-BKT (IBKT), and LSTM, on two types of student modeling tasks: post-test scores prediction and learning gains prediction. Additionally, while previous work on student learning has often used skill/knowledge components identified by domain experts, we incorporated an automatic skill discovery method (SK), which includes a nonparametric prior over the exercise-skill assignments, to all three models. Thus, we explored a total of six models: BKT, BKT+SK, IBKT, IBKT+SK, LSTM, and LSTM+SK. Two training datasets were employed, one was collected from a natural language physics intelligent tutoring system named Cordillera, and the other was from a standard probability intelligent tutoring system named Pyrenees. Overall, our results showed that BKT and BKT+SK outperformed the others on predicting post-test scores, whereas LSTM and LSTM+SK achieved the highest accuracy, F1-measure, and area under the ROC curve (AUC) on predicting learning gains. Furthermore, we demonstrated that by combining SK with the BKT model, BKT+SK could reliably predict post-test scoresmore »using only the earliest 50% of the entire training sequences. For learning gain early prediction, using the earliest 70% of the entire sequences, LSTM can deliver a comparable prediction as using the entire training sequences. The findings yield a learning environment that can foretell students’ performance and learning gains early, and can render adaptive pedagogical strategy accordingly.« less
  3. A bstract Discoveries of new phenomena often involve a dedicated search for a hypothetical physics signature. Recently, novel deep learning techniques have emerged for anomaly detection in the absence of a signal prior. However, by ignoring signal priors, the sensitivity of these approaches is significantly reduced. We present a new strategy dubbed Quasi Anomalous Knowledge (QUAK), whereby we introduce alternative signal priors that capture some of the salient features of new physics signatures, allowing for the recovery of sensitivity even when the alternative signal is incorrect. This approach can be applied to a broad range of physics models and neural network architectures. In this paper, we apply QUAK to anomaly detection of new physics events at the CERN Large Hadron Collider utilizing variational autoencoders with normalizing flow.
  4. Discoveries of new phenomena often involve a dedicated search for a hypothetical physics signature. Recently, novel deep learning techniques have emerged for anomaly detection in the absence of a signal prior. However, by ignoring signal priors, the sensitivity of these approaches is significantly reduced. We present a new strategy dubbed Quasi Anomalous Knowledge (QUAK), whereby we introduce alternative signal priors that capture some of the salient features of new physics signatures, allowing for the recovery of sensitivity even when the alternative signal is incorrect. This approach can be applied to a broad range of physics models and neural network architectures. In this paper, we apply QUAK to anomaly detection of new physics events at the CERN Large Hadron Collider utilizing variational autoencoders with normalizing flow.
  5. Machine learning is increasingly recognized as a promising technology in the biological, biomedical, and behavioral sciences. There can be no argument that this technique is incredibly successful in image recognition with immediate applications in diagnostics including electrophysiology, radiology, or pathology, where we have access to massive amounts of annotated data. However, machine learning often performs poorly in prognosis, especially when dealing with sparse data. This is a field where classical physics-based simulation seems to remain irreplaceable. In this review, we identify areas in the biomedical sciences where machine learning and multiscale modeling can mutually benefit from one another: Machine learning can integrate physics-based knowledge in the form of governing equations, boundary conditions, or constraints to manage ill-posted problems and robustly handle sparse and noisy data; multiscale modeling can integrate machine learning to create surrogate models, identify system dynamics and parameters, analyze sensitivities, and quantify uncertainty to bridge the scales and understand the emergence of function. With a view towards applications in the life sciences, we discuss the state of the art of combining machine learning and multiscale modeling, identify applications and opportunities, raise open questions, and address potential challenges and limitations. This review serves as introduction to a special issuemore »on Uncertainty Quantification, Machine Learning, and Data-Driven Modeling of Biological Systems that will help identify current roadblocks and areas where computational mechanics, as a discipline, can play a significant role. We anticipate that it will stimulate discussion within the community of computational mechanics and reach out to other disciplines including mathematics, statistics, computer science, artificial intelligence, biomedicine, systems biology, and precision medicine to join forces towards creating robust and efficient models for biological systems.« less