skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Physics-Informed Deep Learning for Traffic State Estimation: A Survey and the Outlook
For its robust predictive power (compared to pure physics-based models) and sample-efficient training (compared to pure deep learning models), physics-informed deep learning (PIDL), a paradigm hybridizing physics-based models and deep neural networks (DNNs), has been booming in science and engineering fields. One key challenge of applying PIDL to various domains and problems lies in the design of a computational graph that integrates physics and DNNs. In other words, how the physics is encoded into DNNs and how the physics and data components are represented. In this paper, we offer an overview of a variety of architecture designs of PIDL computational graphs and how these structures are customized to traffic state estimation (TSE), a central problem in transportation engineering. When observation data, problem type, and goal vary, we demonstrate potential architectures of PIDL computational graphs and compare these variants using the same real-world dataset.  more » « less
Award ID(s):
2038984
PAR ID:
10447567
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Algorithms
Volume:
16
Issue:
6
ISSN:
1999-4893
Page Range / eLocation ID:
305
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Efficient real-time solvers for forward and inverse problems are essential in engineering and science applications. Machine learning surrogate models have emerged as promising alter- natives to traditional methods, offering substantially reduced computational time. Never- theless, these models typically demand extensive training datasets to achieve robust gen- eralization across diverse scenarios. While physics-based approaches can partially mitigate this data dependency and ensure physics-interpretable solutions, addressing scarce data regimes remains a challenge. Both purely data-driven and physics-based machine learning approaches demonstrate severe overfitting issues when trained with insufficient data. We propose a novel model-constrained Tikhonov autoencoder neural network framework, called TAEN, capable of learning both forward and inverse surrogate models using a single arbitrary observational sample. We develop comprehensive theoretical foundations including forward and inverse inference error bounds for the proposed approach for linear cases. For compara- tive analysis, we derive equivalent formulations for pure data-driven and model-constrained approach counterparts. At the heart of our approach is a data randomization strategy with theoretical justification, which functions as a generative mechanism for exploring the train- ing data space, enabling effective training of both forward and inverse surrogate models even with a single observation, while regularizing the learning process. We validate our approach through extensive numerical experiments on two challenging inverse problems: 2D heat conductivity inversion and initial condition reconstruction for time-dependent 2D Navier–Stokes equations. Results demonstrate that TAEN achieves accuracy comparable to traditional Tikhonov solvers and numerical forward solvers for both inverse and forward problems, respectively, while delivering orders of magnitude computational speedups. 
    more » « less
  2. Pre-trained deep neural networks (DNNs) are being widely deployed by industry for making business decisions and to serve users; however, a major problem is model decay, where the DNN's predictions become more erroneous over time, resulting in revenue loss or unhappy users. To mitigate model decay, DNNs are retrained from scratch using old and new data. This is computationally expensive, so retraining happens only once performance significantly decreases. Here, we study how continual learning (CL) could potentially overcome model decay in large pre-trained DNNs and greatly reduce computational costs for keeping DNNs up-to-date. We identify the "stability gap" as a major obstacle in our setting. The stability gap refers to a phenomenon where learning new data causes large drops in performance for past tasks before CL mitigation methods eventually compensate for this drop. We test two hypotheses to investigate the factors influencing the stability gap and identify a method that vastly reduces this gap. In large-scale experiments for both easy and hard CL distributions (e.g., class incremental learning), we demonstrate that our method reduces the stability gap and greatly increases computational efficiency. Our work aligns CL with the goals of the production setting, where CL is needed for many applications. 
    more » « less
  3. Topological data analysis (TDA) is a branch of computational mathematics, bridging algebraic topology and data science, that provides compact, noise-robust representations of complex structures. Deep neural networks (DNNs) learn millions of parameters associated with a series of transformations defined by the model architecture resulting in high-dimensional, difficult to interpret internal representations of input data. As DNNs become more ubiquitous across multiple sectors of our society, there is increasing recognition that mathematical methods are needed to aid analysts, researchers, and practitioners in understanding and interpreting how these models' internal representations relate to the final classification. In this paper we apply cutting edge techniques from TDA with the goal of gaining insight towards interpretability of convolutional neural networks used for image classification. We use two common TDA approaches to explore several methods for modeling hidden layer activations as high-dimensional point clouds, and provide experimental evidence that these point clouds capture valuable structural information about the model's process. First, we demonstrate that a distance metric based on persistent homology can be used to quantify meaningful differences between layers and discuss these distances in the broader context of existing representational similarity metrics for neural network interpretability. Second, we show that a mapper graph can provide semantic insight as to how these models organize hierarchical class knowledge at each layer. These observations demonstrate that TDA is a useful tool to help deep learning practitioners unlock the hidden structures of their models. 
    more » « less
  4. Optimal Power Flow (OPF) is a challenging problem in power systems, and recent research has explored the use of Deep Neural Networks (DNNs) to approximate OPF solutions with reduced computational times. While these approaches show promising accuracy and efficiency, there is a lack of analysis of their robustness. This paper addresses this gap by investigating the factors that lead to both successful and suboptimal predictions in DNN-based OPF solvers. It identifies power system features and DNN characteristics that contribute to higher prediction errors and offers insights on mitigating these challenges when designing deep learning models for OPF. 
    more » « less
  5. null (Ed.)
    Unsupervised anomaly detection plays a crucial role in many critical applications. Driven by the success of deep learning, recent years have witnessed growing interests in applying deep neural networks (DNNs) to anomaly detection problems. A common approach is using autoencoders to learn a feature representation for the normal observations in the data. The reconstruction error of the autoencoder is then used as outlier scores to detect the anomalies. However, due to the high complexity brought upon by the over-parameterization of DNNs, the reconstruction error of the anomalies could also be small, which hampers the effectiveness of these methods. To alleviate this problem, we propose a robust framework using collaborative autoencoders to jointly identify normal observations from the data while learning its feature representation. We investigate the theoretical properties of the framework and empirically show its outstanding performance as compared to other DNN-based methods. Our experimental results also show the resiliency of the framework to missing values compared to other baseline methods. 
    more » « less