Title: Epistemic Uncertainty Quantification in State-space LPV Model Identification Using Bayesian Neural Networks
This paper presents a Bayesian neural network (BNN) approach, based on variational inference, to quantify uncertainties in matrix function estimation for the state-space linear parameter-varying (LPV) model identification problem using only input/output data. The proposed method simultaneously estimates the states and the posteriors of the matrix functions given the data. In particular, states are estimated by reaching a consensus between an estimator based on the past system trajectory and an estimator given by the recurrent state equations; posteriors are approximated by minimizing the Kullback–Leibler (KL) divergence between the parameterized posterior distribution and the true posterior of the LPV model parameters. Furthermore, techniques such as transfer learning are explored to reduce the computational cost and prevent convergence failure of Bayesian inference. The proposed data-driven method is validated using experimental data for identification of a control-oriented reactivity controlled compression ignition (RCCI) engine model.
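For reference, the KL-minimization step described in the abstract is commonly written as the maximization of an evidence lower bound. The sketch below uses assumed notation that does not appear in this record: \theta for the LPV matrix-function (network) parameters, \mathcal{D} for the input/output data, and q_\phi for the parameterized posterior. It shows the standard variational-inference identity, which may differ from the exact objective used in the paper.

```latex
\begin{align}
\mathrm{KL}\big(q_\phi(\theta)\,\|\,p(\theta \mid \mathcal{D})\big)
  &= \log p(\mathcal{D}) - \mathcal{L}(\phi), \\
\mathcal{L}(\phi)
  &= \mathbb{E}_{q_\phi(\theta)}\big[\log p(\mathcal{D} \mid \theta)\big]
   - \mathrm{KL}\big(q_\phi(\theta)\,\|\,p(\theta)\big).
\end{align}
```

Because \log p(\mathcal{D}) does not depend on \phi, minimizing the KL divergence to the true posterior is equivalent to maximizing \mathcal{L}(\phi), which requires only the likelihood and the prior.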
Award ID(s):
1762520
PAR ID:
10188473
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
IEEE control systems letters
ISSN:
2475-1456
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents an integrated structure of artificial neural networks (ANNs), named state integrated matrix estimation (SIME), for linear parameter-varying (LPV) model identification. The proposed method simultaneously estimates states and explores the structural dependency of the matrix functions of a representative LPV model using only input/output data. The case of unknown (unmeasurable) states is handled by SIME using two estimators of the same state: one represented by an ANN and the other obtained from the LPV model equations. The difference between these two estimators is minimized as part of the cost function to enforce their consistency. Results from a complex nonlinear system, namely a reactivity controlled compression ignition (RCCI) engine, show high accuracy of the state-space LPV models obtained using the proposed SIME while requiring minimal hyperparameter tuning.
  2. This paper presents an integrated structure of artificial neural networks (ANNs), named state integrated matrix estimation (SIME), for linear parameter-varying (LPV) model identification. The proposed method simultaneously estimates states and explores the structural dependency of the matrix functions of a representative LPV model using only input/output data. The case of unknown (unmeasurable) states is handled by SIME using two estimators of the same state: one represented by an ANN and the other obtained from the LPV model equations. The difference between these two estimators is minimized as part of the cost function to enforce their consistency. Results from a complex nonlinear system, namely a reactivity controlled compression ignition (RCCI) engine, show high accuracy of the state-space LPV models obtained using the proposed SIME while requiring minimal hyperparameter tuning.
  3. A Bayesian method is proposed for variable selection in high-dimensional matrix autoregressive models which reflects and exploits the original matrix structure of the data to (a) reduce dimensionality and (b) foster interpretability of multidimensional relationship structures. A compact form of the model is derived which facilitates the estimation procedure, and two computational methods for the estimation are proposed: a Markov chain Monte Carlo algorithm and a scalable Bayesian EM algorithm. Being based on the spike-and-slab framework for fast posterior mode identification, the latter enables Bayesian data analysis of matrix-valued time series at large scales. The theoretical properties, comparative performance, and computational efficiency of the proposed model are investigated through simulated examples and an application to a panel of country economic indicators.
  4. Approximate confidence distribution computing (ACDC) offers a new take on the rapidly developing field of likelihood-free inference from within a frequentist framework. The appeal of this computational method for statistical inference hinges upon the concept of a confidence distribution, a special type of estimator which is defined with respect to the repeated sampling principle. An ACDC method provides frequentist validation for computational inference in problems with unknown or intractable likelihoods. The main theoretical contribution of this work is the identification of a matching condition necessary for frequentist validity of inference from this method. In addition to providing an example of how a modern understanding of confidence distribution theory can be used to connect Bayesian and frequentist inferential paradigms, we present a case to expand the current scope of so-called approximate Bayesian inference to include non-Bayesian inference by targeting a confidence distribution rather than a posterior. The main practical contribution of this work is the development of a data-driven approach to drive ACDC in both Bayesian and frequentist contexts. The ACDC algorithm is data-driven through the selection of a data-dependent proposal function, the structure of which is quite general and adaptable to many settings. We explore three numerical examples that both verify the theoretical arguments in the development of ACDC and suggest instances in which ACDC outperforms approximate Bayesian computing methods computationally.
  5. Aleatoric uncertainty captures the inherent randomness of the data, such as measurement noise. In Bayesian regression, we often use a Gaussian observation model, where we control the level of aleatoric uncertainty with a noise variance parameter. By contrast, for Bayesian classification we use a categorical distribution with no mechanism to represent our beliefs about aleatoric uncertainty. Our work shows that explicitly accounting for aleatoric uncertainty significantly improves the performance of Bayesian neural networks. We note that many standard benchmarks, such as CIFAR, have essentially no aleatoric uncertainty. Moreover, we show data augmentation in approximate inference has the effect of softening the likelihood, leading to underconfidence and profoundly misrepresenting our honest beliefs about aleatoric uncertainty. Accordingly, we find that a cold posterior, tempered by a power greater than one, often more honestly reflects our beliefs about aleatoric uncertainty than no tempering -- providing an explicit link between data augmentation and cold posteriors. We show that we can match or exceed the performance of posterior tempering by using a Dirichlet observation model, where we explicitly control the level of aleatoric uncertainty, without any need for tempering. 