skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Graph interpolating activation improves both natural and robust accuracies in data-efficient deep learning
Improving the accuracy and robustness of deep neural nets (DNNs) and adapting them to small training data are primary tasks in deep learning (DL) research. In this paper, we replace the output activation function of DNNs, typically the data-agnostic softmax function, with a graph Laplacian-based high-dimensional interpolating function which, in the continuum limit, converges to the solution of a Laplace–Beltrami equation on a high-dimensional manifold. Furthermore, we propose end-to-end training and testing algorithms for this new architecture. The proposed DNN with graph interpolating activation integrates the advantages of both deep learning and manifold learning. Compared to the conventional DNNs with the softmax function as output activation, the new framework demonstrates the following major advantages: First, it is better applicable to data-efficient learning in which we train high capacity DNNs without using a large number of training data. Second, it remarkably improves both natural accuracy on the clean images and robust accuracy on the adversarial images crafted by both white-box and black-box adversarial attacks. Third, it is a natural choice for semi-supervised learning. This paper is a significant extension of our earlier work published in NeurIPS, 2018. For reproducibility, the code is available at https://github.com/BaoWangMath/DNN-DataDependentActivation .  more » « less
Award ID(s):
1952339 1924935
PAR ID:
10220577
Author(s) / Creator(s):
;
Date Published:
Journal Name:
European Journal of Applied Mathematics
ISSN:
0956-7925
Page Range / eLocation ID:
1 to 30
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract We present a novel deep neural network (DNN) training scheme and resistive RAM (RRAM) in-memory computing (IMC) hardware evaluation towards achieving high accuracy against RRAM device/array variations and enhanced robustness against adversarial input attacks. We present improved IMC inference accuracy results evaluated on state-of-the-art DNNs including ResNet-18, AlexNet, and VGG with binary, 2-bit, and 4-bit activation/weight precision for the CIFAR-10 dataset. These DNNs are evaluated with measured noise data obtained from three different RRAM-based IMC prototype chips. Across these various DNNs and IMC chip measurements, we show that our proposed hardware noise-aware DNN training consistently improves DNN inference accuracy for actual IMC hardware, up to 8% accuracy improvement for the CIFAR-10 dataset. We also analyze the impact of our proposed noise injection scheme on the adversarial robustness of ResNet-18 DNNs with 1-bit, 2-bit, and 4-bit activation/weight precision. Our results show up to 6% improvement in the robustness to black-box adversarial input attacks. 
    more » « less
  2. Deep Neural Networks (DNNs) need to be both efficient and robust for practical uses. Quantization and structure simplification are promising ways to adapt DNNs to mobile devices, and adversarial training is one of the most successful methods to train robust DNNs. In this work, we aim to realize both advantages by applying a convergent relaxation quantization algorithm, i.e., Binary-Relax (BR), to an adversarially trained robust model, i.e. the ResNets Ensemble via Feynman-Kac Formalism (EnResNet). We discover that high-precision quantization, such as ternary (tnn) or 4-bit, produces sparse DNNs. However, this sparsity is unstructured under adversarial training. To solve the problems that adversarial training jeopardizes DNNs’ accuracy on clean images and break the structure of sparsity, we design a trade-off loss function that helps DNNs preserve natural accuracy and improve channel sparsity. With our newly designed trade-off loss function, we achieve both goals with no reduction of resistance under weak attacks and very minor reduction of resistance under strong adversarial attacks. Together with our model and algorithm selections and loss function design, we provide an integrated approach to produce robust DNNs with high efficiency and accuracy. Furthermore, we provide a missing benchmark on robustness of quantized models. 
    more » « less
  3. null (Ed.)
    Adversarial training is a popular defense strategy against attack threat models with bounded Lp norms. However, it often degrades the model performance on normal images and the defense does not generalize well to novel attacks. Given the success of deep generative models such as GANs and VAEs in characterizing the underlying manifold of images, we investigate whether or not the aforementioned problems can be remedied by exploiting the underlying manifold information. To this end, we construct an "On-Manifold ImageNet" (OM-ImageNet) dataset by projecting the ImageNet samples onto the manifold learned by StyleGSN. For this dataset, the underlying manifold information is exact. Using OM-ImageNet, we first show that adversarial training in the latent space of images improves both standard accuracy and robustness to on-manifold attacks. However, since no out-of-manifold perturbations are realized, the defense can be broken by Lp adversarial attacks. We further propose Dual Manifold Adversarial Training (DMAT) where adversarial perturbations in both latent and image spaces are used in robustifying the model. Our DMAT improves performance on normal images, and achieves comparable robustness to the standard adversarial training against Lp attacks. In addition, we observe that models defended by DMAT achieve improved robustness against novel attacks which manipulate images by global color shifts or various types of image filtering. Interestingly, similar improvements are also achieved when the defended models are tested on out-of-manifold natural images. These results demonstrate the potential benefits of using manifold information in enhancing robustness of deep learning models against various types of novel adversarial attacks. 
    more » « less
  4. Variability-induced accuracy degradation of RRAM based DNNs is of great concern due to their significant potential for use in future energy-efficient machine learning architectures. To address this, we propose a two-step process. First, an enhanced testing procedure is used to predict DNN accuracy from a set of compact test stimuli (images). This test response (signature) is simply the concatenated vectors of output neurons of intermediate final DNN layers over the compact test images applied. DNNs with a predicted accuracy below a threshold are then tuned based on this signature vector. Using a clustering based approach, the signature is mapped to the optimal tuning parameter values of the DNN (determined using off-line training of the DNN via backpropagation) in a single step, eliminating any post-manufacture training of the DNN weights (expensive). The tuning parameters themselves consist of the gains and offsets of the ReLU activation of neurons of the DNN on a per-layer basis and can be tuned digitally. Tuning is achieved in less than a second of tuning time, with yield improvements of over 45% with a modest accuracy reduction of 4% compared to digital DNNs. 
    more » « less
  5. Variability-induced accuracy degradation of RRAMbased DNNs is of great concern due to their significant potential for use in future energy-efficient machine learning architectures. To address this, we propose a two-step process. First, an enhanced testing procedure is used to predict DNN accuracy from a set of compact test stimuli (images). This test response (signature) is simply the concatenated vectors of output neurons of intermediate and final DNN layers over the compact test images applied. DNNs with a predicted accuracy below a threshold are then tuned based on this signature vector. Using a clustering based approach, the signature is mapped to the optimal tuning parameter values of the DNN (determined using off-line training of the DNN via backpropagation) in a single step, eliminating any post-manufacture training of the DNN weights (expensive). The tuning parameters themselves consist of the gains and offsets of the ReLU activation of neurons of the DNN on a per-layer basis and can be tuned digitally. Tuning is achieved in less than a second of tuning time, with yield improvements of over 45% with a modest accuracy reduction of 4% compared to digital DNNs. 
    more » « less