Title: Integrating Scientific Knowledge with Machine Learning for Engineering and Environmental Systems
There is a growing consensus that solutions to complex science and engineering problems require novel methodologies that are able to integrate traditional physics-based modeling approaches with state-of-the-art machine learning (ML) techniques. This paper provides a structured overview of such techniques. Application-centric objective areas for which these approaches have been applied are summarized, and then classes of methodologies used to construct physics-guided ML models and hybrid physics-ML frameworks are described. We then provide a taxonomy of these existing techniques, which uncovers knowledge gaps and potential crossovers of methods between disciplines that can serve as ideas for future research.
Award ID(s):
1934721
NSF-PAR ID:
10346139
Journal Name:
ACM Computing Surveys
ISSN:
0360-0300
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Easy access to audio-visual content on social media, combined with the availability of modern tools such as TensorFlow or Keras, open-source trained models, economical computing infrastructure, and the rapid evolution of deep-learning (DL) methods, has heralded a new and frightening trend. In particular, the advent of easily available, ready-to-use Generative Adversarial Networks (GANs) has made it possible to generate deepfakes, media partially or completely fabricated with the intent to deceive, in order to disseminate disinformation and revenge porn, perpetrate financial fraud and other hoaxes, and disrupt government functioning. Existing surveys have mainly focused on the detection of deepfake images and videos; this paper provides a comprehensive review and detailed analysis of existing tools and machine learning (ML) based approaches for deepfake generation, and the methodologies used to detect such manipulations in both audio and video. For each category of deepfake, we discuss manipulation approaches, current public datasets, and key standards for evaluating the performance of deepfake detection techniques, along with their results. We also discuss open challenges and enumerate future directions to guide researchers on issues that need to be considered in order to improve both deepfake generation and detection. This work is expected to help readers understand how deepfakes are created and detected, along with their current limitations and where future research may lead.
  2. Machine learning (ML) techniques have become increasingly important in seismology and earthquake science. Lab‐based studies have used acoustic emission data to predict time‐to‐failure and stress state, and in a few cases, the same approach has been used for field data. However, the underlying physical mechanisms that allow lab earthquake prediction and seismic forecasting remain poorly resolved. Here, we address this knowledge gap by coupling active‐source seismic data, which probe asperity‐scale processes, with ML methods. We show that elastic waves passing through the lab fault zone contain information that can predict the full spectrum of labquakes from slow slip instabilities to highly aperiodic events. The ML methods utilize systematic changes in P‐wave amplitude and velocity to accurately predict the timing and shear stress during labquakes. The ML predictions improve in accuracy closer to fault failure, demonstrating that the predictive power of the ultrasonic signals improves as the fault approaches failure. Our results demonstrate that the relationship between the ultrasonic parameters and fault slip rate, and in turn, the systematically evolving real area of contact and asperity stiffness allow the gradient boosting algorithm to “learn” about the state of the fault and its proximity to failure. Broadly, our results demonstrate the utility of physics‐informed ML in forecasting the imminence of fault slip at the laboratory scale, which may have important implications for earthquake mechanics in nature.

     
  3. INTRODUCTION Solving quantum many-body problems, such as finding ground states of quantum systems, has far-reaching consequences for physics, materials science, and chemistry. Classical computers have facilitated many profound advances in science and technology, but they often struggle to solve such problems. Scalable, fault-tolerant quantum computers will be able to solve a broad array of quantum problems but are unlikely to be available for years to come. Meanwhile, how can we best exploit our powerful classical computers to advance our understanding of complex quantum systems? Recently, classical machine learning (ML) techniques have been adapted to investigate problems in quantum many-body physics. So far, these approaches are mostly heuristic, reflecting the general paucity of rigorous theory in ML. Although they have been shown to be effective in some intermediate-size experiments, these methods are generally not backed by convincing theoretical arguments to ensure good performance.

    RATIONALE A central question is whether classical ML algorithms can provably outperform non-ML algorithms in challenging quantum many-body problems. We provide a concrete answer by devising and analyzing classical ML algorithms for predicting the properties of ground states of quantum systems. We prove that these ML algorithms can efficiently and accurately predict ground-state properties of gapped local Hamiltonians, after learning from data obtained by measuring other ground states in the same quantum phase of matter. Furthermore, under a widely accepted complexity-theoretic conjecture, we prove that no efficient classical algorithm that does not learn from data can achieve the same prediction guarantee. By generalizing from experimental data, ML algorithms can solve quantum many-body problems that could not be solved efficiently without access to experimental data.

    RESULTS We consider a family of gapped local quantum Hamiltonians, where the Hamiltonian H(x) depends smoothly on m parameters (denoted by x). The ML algorithm learns from a set of training data consisting of sampled values of x, each accompanied by a classical representation of the ground state of H(x). These training data could be obtained from either classical simulations or quantum experiments. During the prediction phase, the ML algorithm predicts a classical representation of ground states for Hamiltonians different from those in the training data; ground-state properties can then be estimated using the predicted classical representation. Specifically, our classical ML algorithm predicts expectation values of products of local observables in the ground state, with a small error when averaged over the value of x. The run time of the algorithm and the amount of training data required both scale polynomially in m and linearly in the size of the quantum system. Our proof of this result builds on recent developments in quantum information theory, computational learning theory, and condensed matter theory. Furthermore, under the widely accepted conjecture that nondeterministic polynomial-time (NP)–complete problems cannot be solved in randomized polynomial time, we prove that no polynomial-time classical algorithm that does not learn from data can match the prediction performance achieved by the ML algorithm. In a related contribution using similar proof techniques, we show that classical ML algorithms can efficiently learn how to classify quantum phases of matter. In this scenario, the training data consist of classical representations of quantum states, where each state carries a label indicating whether it belongs to phase A or phase B. The ML algorithm then predicts the phase label for quantum states that were not encountered during training. The classical ML algorithm not only classifies phases accurately, but also constructs an explicit classifying function. Numerical experiments verify that our proposed ML algorithms work well in a variety of scenarios, including Rydberg atom systems, two-dimensional random Heisenberg models, symmetry-protected topological phases, and topologically ordered phases.

    CONCLUSION We have rigorously established that classical ML algorithms, informed by data collected in physical experiments, can effectively address some quantum many-body problems. These rigorous results boost our hopes that classical ML trained on experimental data can solve practical problems in chemistry and materials science that would be too hard to solve using classical processing alone. Our arguments build on the concept of a succinct classical representation of quantum states derived from randomized Pauli measurements. Although some quantum devices lack the local control needed to perform such measurements, we expect that other classical representations could be exploited by classical ML with similarly powerful results. How can we make use of accessible measurement data to predict properties reliably? Answering such questions will expand the reach of near-term quantum platforms.

    Classical algorithms for quantum many-body problems. Classical ML algorithms learn from training data, obtained from either classical simulations or quantum experiments. Then, the ML algorithm produces a classical representation for the ground state of a physical system that was not encountered during training. Classical algorithms that do not learn from data may require substantially longer computation time to achieve the same task.
  4. Machine learning (ML) is currently being investigated as an emerging technique to automate quality of transmission (QoT) estimation during lightpath deployment procedures in optical networks. Even though the potential network-resource savings enabled by ML-based QoT estimation have been confirmed in several studies, some practical limitations hinder its adoption in operational network deployments. Among these, the lack of a comprehensive training dataset is recognized as a main limiting factor, especially in the early network deployment phase. In this study, we compare the performance of two ML methodologies explicitly designed to augment small-sized training datasets, namely, active learning (AL) and domain adaptation (DA), for the estimation of the signal-to-noise ratio (SNR) of an unestablished lightpath. This comparison also allows us to provide some guidelines for the adoption of these two techniques at different life stages of a newly deployed optical network infrastructure. Results show that both AL and DA permit us, starting from limited datasets, to reach a QoT estimation capability similar to that achieved by standard supervised learning approaches working on much larger datasets. More specifically, we observe that a few dozen additional samples acquired from selected probe lightpaths already provide significant performance improvement for AL, whereas a few hundred samples gathered from an external network topology are needed in the case of DA.

     
  5. Glasses have been an integral part of human life for more than 2000 years. Despite several years of research and analysis, some fundamental and practical questions on glasses still remain unanswered. While most of the earlier approaches were based on (i) expert knowledge and intuition, (ii) Edisonian trial and error, or (iii) physics-driven modeling and analysis, recent studies suggest that data-driven techniques, such as artificial intelligence (AI) and machine learning (ML), can provide fresh perspectives to tackle some of these questions. In this article, we identify 21 grand challenges in glass science, the solutions of which are either enabling AI and ML or enabled by AI and ML to accelerate the field of glass science. The challenges presented here range from fundamental questions related to glass formation and composition–processing–property relationships to industrial problems such as automated flaw detection in glass manufacturing. We believe that the present article will instill enthusiasm among the readers to explore some of the grand challenges outlined here and to discover many more challenges that can advance the field of glass science, engineering, and technology.