Title: Hybrid Neural Network Augmented Physics-based Models for Nonlinear Filtering
In this paper we present a hybrid neural network augmented physics-based modeling (APBM) framework for Bayesian nonlinear latent space estimation. The proposed APBM strategy allows for model adaptation when new operation conditions come into play or the physics-based model is insufficient (or incomplete) to properly describe the latent phenomenon. One advantage of the APBMs and our estimation procedure is the capability of maintaining the physical interpretability of estimated states. Furthermore, we propose a constraint filtering approach to control the neural network contributions to the overall model. We also exploit assumed density filtering techniques and cubature integration rules to present a flexible estimation strategy that can easily deal with nonlinear models and high-dimensional latent spaces. Finally, we demonstrate the efficacy of our methodology by leveraging a target tracking scenario with nonlinear and incomplete measurement and acceleration models, respectively.
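As a rough illustration of how a cubature integration rule can propagate a Gaussian state estimate through a physics model augmented by a small neural network, consider the sketch below. It is not the authors' implementation. The additive form `physics_step(x) + nn_correction(x, ...)`, the constant-velocity physics model, the network size, and all function names are assumptions made for illustration. The network weights are held fixed here, and the constraint that limits the network's contribution is omitted.

```python
import numpy as np


def cubature_points(mean, cov):
    """Third-degree spherical-radial cubature points for N(mean, cov).

    Returns an (n, 2n) array; every point carries equal weight 1 / (2n).
    """
    n = mean.size
    chol = np.linalg.cholesky(cov)
    offsets = np.sqrt(n) * np.hstack([chol, -chol])
    return mean[:, None] + offsets


def physics_step(x, dt=1.0):
    """Assumed physics model: constant velocity, state = [position, velocity]."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    return F @ x


def nn_correction(x, W1, b1, W2, b2):
    """Hypothetical one-hidden-layer network that corrects the physics model."""
    return W2 @ np.tanh(W1 @ x + b1) + b2


def apbm_predict(mean, cov, Q, params):
    """Prediction step: push cubature points through physics + network."""
    pts = cubature_points(mean, cov)
    prop = np.stack(
        [physics_step(p) + nn_correction(p, *params) for p in pts.T], axis=1
    )
    m_pred = prop.mean(axis=1)            # equal weights, so a plain average
    dev = prop - m_pred[:, None]
    P_pred = dev @ dev.T / prop.shape[1] + Q
    return m_pred, P_pred


# Illustrative values only (not taken from the paper).
rng = np.random.default_rng(0)
m = np.array([0.0, 1.0])
P = np.array([[1.0, 0.2], [0.2, 0.5]])
Q = 0.01 * np.eye(2)
params = (
    0.1 * rng.standard_normal((4, 2)),
    np.zeros(4),
    0.1 * rng.standard_normal((2, 4)),
    np.zeros(2),
)
m_pred, P_pred = apbm_predict(m, P, Q, params)
```

With the network weights set to zero, the prediction reduces to the ordinary linear Kalman prediction for the physics model, because the cubature rule is exact for linear functions. This gives a convenient sanity check for a sketch of this kind.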
Award ID(s):
1935337 1845833
NSF-PAR ID:
10357671
Journal Name:
25th International Conference on Information Fusion (FUSION)
Page Range / eLocation ID:
1 to 6
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Traditional linear subspace-based reduced order models (LS-ROMs) can be used to significantly accelerate simulations in which the solution space of the discretized system has a small dimension (with a fast decaying Kolmogorov 𝑛-width). However, LS-ROMs struggle to achieve speed-ups in problems whose solution space has a large dimension, such as highly nonlinear problems whose solutions have large gradients. Such an issue can be alleviated by combining nonlinear model reduction with operator learning. Over the past decade, many nonlinear manifold-based reduced order models (NM-ROMs) have been proposed. In particular, NM-ROMs based on deep neural networks (DNN) have received increasing interest. This work takes inspiration from adaptive basis methods and specifically focuses on developing an NM-ROM based on Convolutional Neural Network-based autoencoders (CNNAE) with iteration-dependent trainable kernels. Additionally, we investigate DNN-based and quadratic operator inference strategies between latent spaces. A strategy to perform vectorized implicit time integration is also proposed. We demonstrate that the proposed CNN-based NM-ROM, combined with DNN-based operator inference, generally performs better than commonly employed strategies (in terms of prediction accuracy) on a benchmark advection-dominated problem. The method also presents a substantial gain in terms of training speed per epoch, with a training time about one order of magnitude smaller than the one associated with a state-of-the-art technique performing with the same level of accuracy.
  2. This work studies the model identification problem of a class of post-nonlinear mixture models in the presence of dependent latent components. Particularly, our interest lies in latent components that are nonnegative and sum-to-one. This problem is motivated by applications such as hyperspectral unmixing under nonlinear distortion effects. Many prior works tackled nonlinear mixture analysis using statistical independence among the latent components, which is not applicable in our case. A recent work by Yang et al. put forth a solution for this problem leveraging functional equations. However, the identifiability conditions derived there are somewhat restrictive. The associated implementation also has difficulties: the function approximator used in their work may not be able to represent general nonlinear distortions, and the formulated constrained neural network optimization problem may be challenging to handle. In this work, we advance both the theoretical and practical aspects of the problem of interest. On the theory side, we offer a new identifiability condition that circumvents a series of stringent assumptions in Yang et al.'s work. On the algorithm side, we propose an easy-to-implement unconstrained neural network-based algorithm without sacrificing function approximation capabilities. Numerical experiments are employed to support our design.
  3. In this work, we present a new approach for latent system dynamics and remaining useful life (RUL) estimation of complex degrading systems using generative modeling and reinforcement learning. The main contributions of the proposed method are two-fold. First, we show how a deep generative model can approximate the functionality of high-fidelity simulators and, thus, is able to substitute expensive and complex physics-based models with data-driven surrogate ones. In other words, we can use the generative model in lieu of the actual system as a surrogate model of the system. Furthermore, we show how to use such surrogate models for predictive analytics. Our method follows two main steps. First, we use a deep variational autoencoder (VAE) to learn the distribution over the latent state-space that characterizes the dynamics of the system under monitoring. After model training, the probabilistic VAE decoder becomes the surrogate system model. Then, we develop a scalable reinforcement learning framework using the decoder as the environment, to train an agent for identifying adequate approximate values of the latent dynamics, as well as the RUL. To our knowledge, the method presented in this paper is the first in industrial prognostics that utilizes generative models and reinforcement learning in that capacity. While the process requires extensive data preprocessing and environment tailored design, which is not always possible, it demonstrates the ability of generative models working in conjunction with reinforcement learning to provide proper value estimations for system dynamics and their RUL. To validate the quality of the proposed method, we conducted numerical experiments using the train_FD002 dataset provided by the NASA CMAPSS data repository. Different subsets were used to train the VAE and the RL agent, and a leftover set was then used for model validation. The results shown prove the merit of our method and will further assist us in developing a data-driven RL environment that incorporates more complex latent dynamic layers, such as normal/faulty operating conditions and hazard processes.
  4. Systems exhibiting nonlinear dynamics, including but not limited to chaos, are ubiquitous across Earth Sciences such as Meteorology, Hydrology, Climate and Ecology, as well as Biology such as neural and cardiac processes. However, System Identification remains a challenge. In climate and earth systems models, while governing equations follow from first principles and understanding of key processes has steadily improved, the largest uncertainties are often caused by parameterizations such as cloud physics, which in turn have witnessed limited improvements over the last several decades. Climate scientists have pointed to Machine Learning enhanced parameter estimation as a possible solution, with proof-of-concept methodological adaptations being examined on idealized systems. While climate science has been highlighted as a "Big Data" challenge owing to the volume and complexity of archived model-simulations and observations from remote and in-situ sensors, the parameter estimation process is often a relatively "small data" problem. A crucial question for data scientists in this context is the relevance of state-of-the-art data-driven approaches including those based on deep neural networks or kernel-based processes. Here we consider a chaotic system, the two-level Lorenz-96, used as a benchmark model in the climate science literature, adopt a methodology based on Gaussian Processes for parameter estimation, and compare the gains in predictive understanding with a suite of Deep Learning and strawman Linear Regression methods. Our results show that adaptations of kernel-based Gaussian Processes can outperform other approaches under small data constraints along with uncertainty quantification, and need to be considered as a viable approach in climate science and earth system modeling.
  5. There has been significant work recently in developing machine learning (ML) models in high energy physics (HEP) for tasks such as classification, simulation, and anomaly detection. Often these models are adapted from those designed for datasets in computer vision or natural language processing, which lack inductive biases suited to HEP data, such as equivariance to its inherent symmetries. Such biases have been shown to make models more performant and interpretable, and reduce the amount of training data needed. To that end, we develop the Lorentz group autoencoder (LGAE), an autoencoder model equivariant with respect to the proper, orthochronous Lorentz group $\mathrm{SO}^+(3,1)$, with a latent space living in the representations of the group. We present our architecture and several experimental results on jets at the LHC and find it outperforms graph and convolutional neural network baseline models on several compression, reconstruction, and anomaly detection metrics. We also demonstrate the advantage of such an equivariant model in analyzing the latent space of the autoencoder, which can improve the explainability of potential anomalies discovered by such ML models.
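To make the notion of equivariance under the proper, orthochronous Lorentz group in item 5 more concrete, the following sketch checks two elementary facts numerically: summing particle four-momenta commutes with a Lorentz boost (equivariance), and the Minkowski norm of the summed momentum is unchanged by the boost (invariance). This is not the LGAE architecture. The metric signature (+, -, -, -), the boost along the x axis, and the sample four-momenta are all assumptions chosen for illustration.

```python
import numpy as np

# Minkowski metric with assumed signature (+, -, -, -).
ETA = np.diag([1.0, -1.0, -1.0, -1.0])


def boost_x(beta):
    """Lorentz boost along x with velocity beta (in units of c, |beta| < 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = gamma
    lam[0, 1] = lam[1, 0] = -gamma * beta
    return lam


def minkowski_dot(p, q):
    """Minkowski inner product of two four-vectors."""
    return p @ ETA @ q


# Made-up four-momenta (E, px, py, pz) of three particles in a jet.
particles = np.array(
    [
        [5.0, 3.0, 1.0, 2.0],
        [4.0, 1.0, 2.0, 3.0],
        [6.0, 2.0, 2.0, 4.0],
    ]
)

LAMBDA = boost_x(0.6)
jet = particles.sum(axis=0)                   # sum, then boost
boosted_particles = particles @ LAMBDA.T      # boost each particle
jet_boosted = boosted_particles.sum(axis=0)   # boost, then sum

print(np.allclose(jet_boosted, LAMBDA @ jet))
print(np.isclose(minkowski_dot(jet_boosted, jet_boosted),
                 minkowski_dot(jet, jet)))
```

Both checks should print `True`. The boost has unit determinant and a positive time-time entry, which is what places it in the proper, orthochronous component of the group.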