Title: Towards Certificated Model Robustness Against Weight Perturbations
This work studies the sensitivity of neural networks to weight perturbations, corresponding to a newly developed threat model in which the adversary perturbs the network parameters. We propose an efficient approach to compute a certified robustness bound on weight perturbations, within which neural networks will not make the erroneous outputs desired by the adversary. In addition, we identify a useful connection between our certification method and the problem of weight quantization, a popular model compression technique for deep neural networks (DNNs) and a ‘must-try’ step in the design of DNN inference engines on resource-constrained computing platforms such as mobile devices, FPGAs, and ASICs. Specifically, we study weight quantization (weight perturbations in the non-adversarial setting) through the lens of certificated robustness, and we demonstrate significant improvements in the generalization ability of quantized networks through our robustness-aware quantization scheme.
Award ID(s):
1929300
NSF-PAR ID:
10169579
Author(s) / Creator(s):
Date Published:
Journal Name:
Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
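The certification idea summarized in the abstract above is to bound how far the network's outputs can move when every weight is perturbed within a given radius, and to declare the prediction certified when no perturbation inside that radius can change the predicted class. The paper's own procedure is not reproduced on this page, so the following is only a minimal NumPy sketch of that general recipe, propagating interval bounds through a small ReLU network whose weights may each move by at most eps; the layer sizes, the L-infinity perturbation model, and the bisection search for the largest certifiable radius are illustrative assumptions rather than the authors' algorithm.

```python
import numpy as np

def interval_matvec(W_lo, W_hi, x_lo, x_hi):
    """Interval bounds of W @ x when both W and x lie in boxes."""
    # Elementwise corner products, then min/max per entry before summing rows.
    cands = np.stack([W_lo * x_lo, W_lo * x_hi, W_hi * x_lo, W_hi * x_hi])
    lo = cands.min(axis=0).sum(axis=1)
    hi = cands.max(axis=0).sum(axis=1)
    return lo, hi

def interval_forward(weights, biases, x, eps):
    """Propagate interval bounds through a ReLU net, assuming every weight
    may be perturbed by at most eps (L-infinity)."""
    lo = hi = x
    for k, (W, b) in enumerate(zip(weights, biases)):
        lo, hi = interval_matvec(W - eps, W + eps, lo, hi)
        lo, hi = lo + b, hi + b
        if k < len(weights) - 1:                  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def is_certified(weights, biases, x, label, eps):
    """The prediction cannot flip if the true logit's lower bound exceeds
    every other logit's upper bound."""
    lo, hi = interval_forward(weights, biases, x, eps)
    return lo[label] > np.delete(hi, label).max()

# Toy usage: random 2-layer ReLU net; bisect the largest certifiable radius.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(3, 8))]
biases = [np.zeros(8), np.zeros(3)]
x = rng.normal(size=4)
label = int(np.argmax(interval_forward(weights, biases, x, 0.0)[0]))
lo_e, hi_e = 0.0, 1.0
for _ in range(30):                               # bisection on the perturbation radius
    mid = 0.5 * (lo_e + hi_e)
    if is_certified(weights, biases, x, label, mid):
        lo_e = mid
    else:
        hi_e = mid
print(f"certified weight-perturbation radius ~ {lo_e:.4f}")
```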
More Like this
  1. Abstract

    Increasing ocean temperatures have widespread consequences for coral reefs, one of which is coral bleaching. We analyzed a global network of associations between coral species and Symbiodiniaceae for resistance to temperature stress and robustness to perturbations. Null networks were created by changing either the physiological parameters of the nodes or the structures of the networks. We developed a bleaching model in which each link (association) is given a weight based on temperature thresholds for specific host–symbiont pairs, and links are removed as temperature increases. Resistance to temperature stress was determined from the response of the networks to the bleaching model. Ecological robustness, defined by how much perturbation is needed to decrease the number of nodes by 50%, was determined for multiple removal models that considered traits of the hosts, symbionts, and their associations. Network resistance to bleaching and robustness to perturbations differed from the null networks and varied across spatial scales, indicating that thermal tolerances, local association patterns, and environment play an important role in network persistence. Networks were more robust to attacks on associations than to attacks on species. Although the global network was fairly robust to random link removals, when links were removed according to the bleaching model, robustness decreased by about 20%. Specific environmental attacks, in the form of increasing temperatures, destabilize the global network of coral species and Symbiodiniaceae. On a global scale, the network was more robust to removals of links with susceptible Symbiodiniaceae than it was to removals of links with susceptible hosts. Thus, the symbionts convey more stability to the symbiosis than the hosts when the system is under an environmental attack. However, our results also provide evidence that the environment of the networks affects robustness to link perturbations. Our work shows that ecological resistance and robustness can be assessed through network analysis that considers specific biological traits and functional weaknesses. The global network of associations between corals and Symbiodiniaceae and its distribution of thermal tolerances are non‐random, and the evolution of this architecture has led to higher sensitivity to environmental perturbations.

     
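The bleaching model described above assigns each host-symbiont link a temperature threshold, removes links as temperature rises (dropping species left with no partners), and measures robustness by how much perturbation is needed to lose half of the nodes. As a rough illustration only, the networkx sketch below implements that removal-and-count loop on a tiny toy network; the `bleach_temp` edge attribute, the species names, and the thresholds are all made up for the example and are not the study's data or exact procedure.

```python
import networkx as nx

def bleaching_robustness(G, temp_attr="bleach_temp", temps=None):
    """Remove links whose thermal threshold is exceeded and report the
    temperature at which half of the original species have lost all links."""
    if temps is None:
        temps = sorted(nx.get_edge_attributes(G, temp_attr).values())
    n0 = G.number_of_nodes()
    H = G.copy()
    for t in temps:
        doomed = [e for e, thr in nx.get_edge_attributes(H, temp_attr).items() if thr <= t]
        H.remove_edges_from(doomed)
        H.remove_nodes_from(list(nx.isolates(H)))   # a species with no partners is lost
        if H.number_of_nodes() <= n0 / 2:
            return t
    return None                                     # the network never lost half of its nodes

# Toy host-symbiont network with hypothetical thresholds (deg C).
G = nx.Graph()
G.add_edge("Acropora", "C3", bleach_temp=30.5)
G.add_edge("Acropora", "D1", bleach_temp=32.0)
G.add_edge("Porites", "C15", bleach_temp=31.0)
G.add_edge("Porites", "D1", bleach_temp=32.5)
print("temperature at 50% node loss:", bleaching_robustness(G))
```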
  2. We consider the post-training quantization problem, which discretizes the weights of pre-trained deep neural networks without re-training the model. We propose multipoint quantization, a quantization method that approximates a full-precision weight vector using a linear combination of multiple vectors of low-bit numbers; this is in contrast to typical quantization methods that approximate each weight using a single low-precision number. Computationally, we construct the multipoint quantization with an efficient greedy selection procedure and adaptively decide the number of low-precision points on each quantized weight vector based on the error of its output. This allows us to achieve higher precision levels for important weights that greatly influence the outputs, yielding an 'effect of mixed precision' without physical mixed-precision implementations (which require specialized hardware accelerators). Empirically, our method can be implemented with common operands, bringing almost no memory or computation overhead. We show that our method outperforms a range of state-of-the-art methods on ImageNet classification and that it generalizes to more challenging tasks such as PASCAL VOC object detection.
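Multipoint quantization, as summarized above, approximates a full-precision weight vector by a linear combination of several low-bit vectors chosen greedily, adding points until the output error is small enough. The NumPy sketch below is only a guess at the general shape of such a greedy residual-fitting procedure (uniform symmetric quantization of the current residual with a least-squares coefficient per point); the bit-width, point budget, and stopping tolerance are assumptions, not the paper's exact construction.

```python
import numpy as np

def quantize_uniform(v, bits=4):
    """Symmetric uniform quantization of a vector onto a signed low-bit grid."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(v).max() / qmax if np.abs(v).max() > 0 else 1.0
    return np.round(v / scale).clip(-qmax, qmax) * scale

def multipoint_quantize(w, bits=4, max_points=4, tol=1e-3):
    """Greedily approximate w by a sum a_1*q_1 + ... + a_k*q_k of low-bit vectors."""
    residual = w.copy()
    points, coeffs = [], []
    for _ in range(max_points):
        q = quantize_uniform(residual, bits)
        denom = float(q @ q)
        if denom == 0.0:
            break
        a = float(residual @ q) / denom       # least-squares coefficient for this point
        points.append(q)
        coeffs.append(a)
        residual = residual - a * q
        if np.linalg.norm(residual) / np.linalg.norm(w) < tol:
            break                             # approximation is accurate enough; stop adding points
    return coeffs, points

# Toy usage on a random weight vector.
rng = np.random.default_rng(0)
w = rng.normal(size=64)
coeffs, points = multipoint_quantize(w, bits=4, max_points=4)
approx = sum(a * q for a, q in zip(coeffs, points))
print(len(points), "points, relative error:",
      np.linalg.norm(w - approx) / np.linalg.norm(w))
```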
  3. The robustness and vulnerability of Deep Neural Networks (DNNs) are quickly becoming a critical area of interest, since these models are in widespread use across real-world applications (e.g., image and audio analysis, recommendation systems, and natural language analysis). A DNN's vulnerability is exploited by an adversary to generate data that attacks the model; however, the majority of adversarial data generators have focused on image domains, with far less work on audio domains. More recently, audio analysis models (e.g., speech command classification, automatic speech recognition) were shown to be vulnerable to adversarial audio examples. Thus, one urgent open problem is to detect adversarial audio reliably. In this contribution, we detect adversarial audio by incorporating a separate yet related DNN technique, namely model quantization. We propose an algorithm that detects adversarial audio using a DNN's quantization error. Specifically, we demonstrate that adversarial audio typically exhibits a larger activation quantization error than benign audio. The quantization error is measured using character error rates, and we use the difference in errors to discriminate adversarial audio. Experiments with three state-of-the-art audio attack algorithms against the DeepSpeech model show that our detection algorithm achieves high accuracy on the Mozilla dataset.
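The detection recipe above boils down to transcribing the same audio with a full-precision ASR model and a quantized copy, then flagging inputs whose transcripts disagree by more than a threshold, since adversarial audio tends to incur a larger quantization-induced error. The sketch below assumes two hypothetical callables, `transcribe_full` and `transcribe_quantized` (stand-ins for, e.g., a DeepSpeech model and its quantized variant, which are not set up here), and implements only the character-error-rate comparison; the 0.15 threshold is illustrative, not the paper's tuned value.

```python
def char_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein distance between the two strings, normalized by reference length."""
    m, n = len(reference), len(hypothesis)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                                           # deletion
                        dp[j - 1] + 1,                                       # insertion
                        prev + (reference[i - 1] != hypothesis[j - 1]))      # substitution
            prev = cur
    return dp[n] / max(m, 1)

def looks_adversarial(audio, transcribe_full, transcribe_quantized, threshold=0.15):
    """Flag audio whose full-precision and quantized transcripts diverge too much.
    `transcribe_full` and `transcribe_quantized` are hypothetical ASR callables."""
    cer = char_error_rate(transcribe_full(audio), transcribe_quantized(audio))
    return cer > threshold

# Toy check of the CER computation itself (two substitutions out of 13 chars).
print(char_error_rate("open the door", "open the deer"))
```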
  4. With the recent demand for deploying neural network models on mobile and edge devices, it is desirable to improve a model's generalizability to unseen testing data as well as its robustness under fixed-point quantization for efficient deployment. Minimizing the training loss, however, provides few guarantees on generalization or quantization performance. In this work, we fulfill the need to improve generalization and quantization performance simultaneously by theoretically unifying them under the framework of improving the model's robustness against bounded weight perturbations and minimizing the eigenvalues of the Hessian matrix with respect to the model weights. We therefore propose HERO, a Hessian-enhanced robust optimization method, which minimizes the Hessian eigenvalues through a gradient-based training process and thereby simultaneously improves generalization and quantization performance. HERO enables up to a 3.8% gain in test accuracy, up to 30% higher accuracy under 80% training label perturbation, and the best post-training quantization accuracy across a wide range of precisions, including a >10% accuracy improvement over SGD-trained models for common model architectures on various datasets.
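HERO's central ingredient, per the abstract above, is to push down the eigenvalues of the loss Hessian with respect to the weights during training. The exact HERO objective is not given here; the PyTorch fragment below only sketches the generic building block, estimating the top Hessian eigenvalue by power iteration on Hessian-vector products and adding it to the training loss as a penalty, with the iteration count and the 0.1 penalty weight chosen arbitrarily for illustration.

```python
import torch
import torch.nn as nn

def top_hessian_eigenvalue(loss, params, iters=5):
    """Power-iteration estimate of the largest Hessian eigenvalue of `loss`
    w.r.t. `params`, built from Hessian-vector products; create_graph=True
    keeps the estimate differentiable so it can be penalized during training."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    for _ in range(iters):
        hv = torch.autograd.grad(grads, params, grad_outputs=v,
                                 retain_graph=True, create_graph=True)
        norm = torch.sqrt(sum((h * h).sum() for h in hv)) + 1e-12
        v = [h / norm for h in hv]                     # normalized Hv becomes the new direction
    hv = torch.autograd.grad(grads, params, grad_outputs=v,
                             retain_graph=True, create_graph=True)
    return sum((h * u).sum() for h, u in zip(hv, v))   # Rayleigh quotient (v has unit norm)

# Toy usage: penalize the top eigenvalue on a tiny regression model (illustrative weights).
model = nn.Linear(10, 1)
x, y = torch.randn(32, 10), torch.randn(32, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
params = [p for p in model.parameters() if p.requires_grad]
loss = nn.functional.mse_loss(model(x), y)
total = loss + 0.1 * top_hessian_eigenvalue(loss, params)  # 0.1 is an arbitrary penalty weight
opt.zero_grad()
total.backward()
opt.step()
print(float(loss), float(total))
```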
  5. Graph Neural Networks (GNNs) have drawn significant attention over the years and have been broadly applied to essential applications that require strong robustness or rigorous security standards, such as product recommendation and user behavior modeling. Under these scenarios, exploiting a GNN's vulnerabilities to degrade its performance becomes highly attractive to adversaries. Previous attacks mainly focus on structural perturbations of, or node injections into, existing graphs, guided by gradients from surrogate models. Although they deliver promising results, several limitations remain. For structural perturbation attacks, adversaries need to manipulate the existing graph topology, which is impractical in most circumstances. Node injection attacks, though more practical, currently require training surrogate models to simulate a white-box setting, which results in a significant performance drop when the surrogate architecture diverges from the actual victim model. To bridge these gaps, in this paper we study the problem of black-box node injection attacks without training a potentially misleading surrogate model. Specifically, we model the node injection attack as a Markov decision process and propose Gradient-free Graph Advantage Actor Critic (G2A2C), a reinforcement learning framework in the style of advantage actor critic. By directly querying the victim model, G2A2C learns to inject highly malicious nodes with extremely limited attacking budgets while maintaining a similar node feature distribution. Through comprehensive experiments on eight widely used benchmark datasets with different characteristics, we demonstrate the superior performance of our proposed G2A2C over existing state-of-the-art attackers. Source code is publicly available at: https://github.com/jumxglhf/G2A2C.

     
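The last record frames black-box node injection as a Markov decision process in which the attacker repeatedly queries the victim GNN. G2A2C's actual actor-critic machinery lives in the linked repository; the fragment below only sketches what a single MDP step of such a query loop might look like, with a toy stand-in for the black-box victim (`toy_victim_predict`) and a reward defined as the drop in the victim's confidence on the target node. The graph encoding, the injected features, and the success criterion are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

def toy_victim_predict(graph):
    """Stand-in for the black-box victim GNN: per-node confidence from a
    one-hop mean of node features passed through a fixed linear readout."""
    x, edges = graph["x"], graph["edges"]
    n = len(x)
    agg = x.copy()
    deg = np.ones(n)
    for u, v in edges:
        agg[u] += x[v]; agg[v] += x[u]
        deg[u] += 1; deg[v] += 1
    logits = (agg / deg[:, None]) @ np.ones(x.shape[1])
    return 1.0 / (1.0 + np.exp(-logits))              # confidence per node

def injection_step(graph, target, feat, neighbors, victim_predict):
    """One MDP step of a black-box node-injection attack: add a node with
    features `feat`, wire it to `neighbors`, re-query the victim, and reward
    the drop in the victim's confidence on the target node."""
    before = victim_predict(graph)[target]
    new_id = len(graph["x"])
    graph["x"] = np.vstack([graph["x"], feat])
    graph["edges"] = graph["edges"] + [(new_id, v) for v in neighbors]
    after = victim_predict(graph)[target]
    reward = before - after                           # larger confidence drop -> larger reward
    done = after < 0.5                                # illustrative success criterion
    return graph, reward, done

# Toy episode: inject one strongly negative node next to the target.
rng = np.random.default_rng(0)
graph = {"x": rng.uniform(0.5, 1.0, size=(4, 3)), "edges": [(0, 1), (1, 2), (2, 3)]}
graph, reward, done = injection_step(graph, target=1, feat=np.full(3, -5.0),
                                     neighbors=[1], victim_predict=toy_victim_predict)
print(reward, done)
```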