skip to main content


Title: Robust Learning via Ensemble Density Propagation in Deep Neural Networks
Learning in uncertain, noisy, or adversarial environments is a challenging task for deep neural networks (DNNs). We propose a new theoretically grounded and efficient approach for robust learning that builds upon Bayesian estimation and Variational Inference. We formulate the problem of density propagation through layers of a DNN and solve it using an Ensemble Density Propagation (EnDP) scheme. The EnDP approach allows us to propagate moments of the variational probability distribution across the layers of a Bayesian DNN, enabling the estimation of the mean and covariance of the predictive distribution at the output of the model. Our experiments using MNIST and CIFAR-10 datasets show a significant improvement in the robustness of the trained models to random noise and adversarial attacks.  more » « less
Award ID(s):
1903466 2008690 2234836
NSF-PAR ID:
10259923
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP)
Page Range / eLocation ID:
1 to 6
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Deep neural networks (DNNs) have surpassed human-level accuracy in various learning tasks. However, unlike humans who have a natural cognitive intuition for probabilities, DNNs cannot express their uncertainty in the output decisions. This limits the deployment of DNNs in mission critical domains, such as warfighter decision-making or medical diagnosis. Bayesian inference provides a principled approach to reason about model’s uncertainty by estimating the posterior distribution of the unknown parameters. The challenge in DNNs remains the multi-layer stages of non-linearities, which make the propagation of high-dimensional distributions mathematically intractable. This paper establishes the theoretical and algorithmic foundations of uncertainty or belief propagation by developing new deep learning models named PremiUm-CNNs (Propagating Uncertainty in Convolutional Neural Networks). We introduce a tensor normal distribution as a prior over convolutional kernels and estimate the variational posterior by maximizing the evidence lower bound (ELBO). We start by deriving the first-order mean-covariance propagation framework. Later, we develop a framework based on the unscented transformation (correct at least up to the second-order) that propagates sigma points of the variational distribution through layers of a CNN. The propagated covariance of the predictive distribution captures uncertainty in the output decision. Comprehensive experiments conducted on diverse benchmark datasets demonstrate: 1) superior robustness against noise and adversarial attacks, 2) self-assessment through predictive uncertainty that increases quickly with increasing levels of noise or attacks, and 3) an ability to detect a targeted attack from ambient noise. 
    more » « less
  2. Model confidence or uncertainty is critical in autonomous systems as they directly tie to the safety and trustworthiness of the system. The quantification of uncertainty in the output decisions of deep neural networks (DNNs) is a challenging problem. The Bayesian framework enables the estimation of the predictive uncertainty by introducing probability distributions over the (unknown) network weights; however, the propagation of these high-dimensional distributions through multiple layers and non-linear transformations is mathematically intractable. In this work, we propose an extended variational inference (eVI) framework for convolutional neural network (CNN) based on tensor Normal distributions (TNDs) defined over convolutional kernels. Our proposed eVI framework propagates the first two moments (mean and covariance) of these TNDs through all layers of the CNN. We employ first-order Taylor series linearization to approximate the mean and covariances passing through the non-linear activations. The uncertainty in the output decision is given by the propagated covariance of the predictive distribution. Furthermore, we show, through extensive simulations on the MNIST and CIFAR-10 datasets, that the CNN becomes more robust to Gaussian noise and adversarial attacks. 
    more » « less
  3. Powerful adversarial attack methods are vital for understanding how to construct robust deep neural networks (DNNs) and thoroughly testing defense techniques. In this project, we propose a black-box adversarial attack algorithm that can defeat both vanilla DNNs and those generated by various defense techniques developed recently. Instead of searching for an “optimal” adversarial example for a benign input to a targeted DNN, our algorithm finds a probability density distribution over a small region centered around the input, such that a sample drawn from this distribution is likely an adversarial example, without the need of accessing the DNN’s internal layers or weights. Our approach is universal as it can successfully attack different neural networks by a single algorithm. It is also strong; according to the testing against 2 vanilla DNNs and 13 defended ones, it outperforms state-of-the-art black-box or white-box attack methods for most test cases. Additionally, our results reveal that adversarial training remains one of the best defense techniques, and the adversarial examples are not as transferable across defended DNNs as them across vanilla DNNs. 
    more » « less
  4. In this work, we propose a novel approach for the real-time estimation of chip-level spatial power maps for commercial Google Coral M.2 TPU chips based on a machine-learning technique for the first time. The new method can enable the development of more robust runtime power and thermal control schemes to take advantage of spatial power information such as hot spots that are otherwise not available. Different from the existing commercial multi-core processors in which real-time performance-related utilization information is available, the TPU from Google does not have such information. To mitigate this problem, we propose to use features that are related to the workloads of running different deep neural networks (DNN) such as the hyperparameters of DNN and TPU resource information generated by the TPU compiler. The new approach involves the offline acquisition of accurate spatial and temporal temperature maps captured from an external infrared thermal imaging camera under nominal working conditions of a chip. To build the dynamic power density map model, we apply generative adversarial networks (GAN) based on the workload-related features. Our study shows that the estimated total powers match the manufacturer's total power measurements extremely well. Experimental results further show that the predictions of power maps are quite accurate, with the RMSE of only 4.98\rm mW/mm^2, or 2.6\% of the full-scale error. The speed of deploying the proposed approach on an Intel Core i7-10710U is as fast as 6.9ms, which is suitable for real-time estimation. 
    more » « less
  5. One essential task in practice is to quantify and improve the reliability of an infrastructure network in terms of the connectivity of network components (i.e., all-terminal reliability). However, as the number of edges and nodes in the network increases, computing the all-terminal network reliability using exact algorithms becomes prohibitive. This is extremely burdensome in network designs requiring repeated computations. In this paper, we propose a novel machine learning-based framework for evaluating and improving all-terminal network reliability using Deep Neural Networks (DNNs) and Deep Reinforcement Learning (DRL). With the help of DNNs and Stochastic Variational Inference (SVI), we can effectively compute the all-terminal reliability for different network configurations in DRL. Furthermore, the Bayesian nature of the proposed SVI+DNN model allows for quantifying the estimation uncertainty while enforcing regularization and reducing overfitting. Our numerical experiment and case study show that the proposed framework provides an effective tool for infrastructure network reliability improvement. 
    more » « less