

Title: Modeling Biological Immunity to Adversarial Examples
While deep learning continues to permeate all fields of signal processing and machine learning, a critical exploit in these frameworks exists and remains unsolved. These exploits, or adversarial examples, are a type of signal attack that can change the output class of a classifier by perturbing the stimulus signal by an imperceptible amount. The attack takes advantage of statistical irregularities within the training data, where the added perturbations can move the image across deep learning decision boundaries. What is even more alarming is the transferability of these attacks to different deep learning models and architectures: a successful attack on one model can have adversarial effects on other, unrelated models. In a general sense, adversarial attack through perturbation is not exclusively a machine learning vulnerability. Human and biological vision can also be fooled by various methods, e.g., by mixing high- and low-frequency images together, by altering semantically related signals, or by sufficiently distorting the input signal. However, the amount and magnitude of distortion required to alter biological perception is at a much larger scale. In this work, we explored this gap through the lens of biology and neuroscience in order to understand the robustness exhibited in human perception. Our experiments show that by leveraging sparsity and modeling the biological mechanisms at a cellular level, we are able to mitigate the effect of adversarial alterations to the signal that have no perceptible meaning. Furthermore, we present and illustrate the effects of top-down functional processes that contribute to the inherent immunity in human perception, in the context of exploiting these properties to make a more robust machine vision system.
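The kind of perturbation attack described in this abstract is typically constructed from the classifier's own gradient. Below is a minimal, hedged sketch of the fast gradient sign method (FGSM), one standard way such imperceptible perturbations are generated; the model, inputs, and epsilon value are placeholders, and this illustrates the general attack rather than the defense proposed in the paper.

```python
# Minimal FGSM sketch (illustrative only): perturb the input by a small step
# along the sign of the loss gradient so the change is imperceptible but can
# push the image across a decision boundary.
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Perturb `image` in the direction that increases the classification loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step along the sign of the input gradient, then keep pixels in [0, 1].
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```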
Award ID(s):
1954364
NSF-PAR ID:
10232438
Author(s) / Creator(s):
Date Published:
Journal Name:
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Page Range / eLocation ID:
4665 to 4674
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Adversarial images are a class of images that have been slightly altered by very specific noise to change the way a deep learning neural network classifies the image. In many cases, this particular noise is imperceptible to the human vision system and thus presents a vulnerability of significant concern to the machine learning and artificial intelligence community. Research towards mitigating this type of attack has taken many forms, one of which is to filter or post-process the image before classifying it with a deep neural network. Techniques such as smoothing, filtering, and compression have been used with varying levels of success. In our work, we explored the use of a neuromorphic software and hardware approach as a protection against adversarial image attack. The algorithm governing our neuromorphic approach is based upon sparse coding. Our sparse coding approach is solved using a dynamic system of equations that models biological low-level vision. Our quantitative and qualitative results show that classification accuracy on sparse coding reconstructions is remarkably invariant to changes in sparsity and reconstruction error. Furthermore, our approach is able to maintain low reconstruction errors without sacrificing classification performance.
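As a rough illustration of the sparse-coding reconstruction used as preprocessing in item 1, the sketch below uses ISTA (iterative soft thresholding) as a stand-in for the dynamical, biologically inspired solver the authors describe; the dictionary `D`, the sparsity weight `lam`, and the step count are assumptions, not the authors' settings.

```python
# Hedged sketch of sparse-coding reconstruction as an input "cleaning" step
# before classification, using ISTA in place of an LCA-style dynamical solver.
import numpy as np

def ista_sparse_code(D, x, lam=0.1, steps=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = D.T @ (D @ a - x)           # gradient of the reconstruction term
        a = a - grad / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft threshold
    return a

def reconstruct(D, x, **kw):
    """Return the sparse reconstruction that would be fed to the classifier."""
    return D @ ista_sparse_code(D, x, **kw)
```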
  2. Deep neural networks (DNNs) are widely used to handle many difficult tasks, such as image classification and malware detection, and achieve outstanding performance. However, recent studies on adversarial examples, i.e., original samples with maliciously added perturbations that are imperceptible to human eyes but mislead machine learning approaches, show that machine learning models are vulnerable to security attacks. Though various adversarial retraining techniques have been developed in the past few years, none of them is scalable. In this paper, we propose a new iterative adversarial retraining approach to robustify the model and to reduce the effectiveness of adversarial inputs on DNN models. The proposed method retrains the model with both Gaussian noise augmentation and adversarial generation techniques for better generalization. Furthermore, the ensemble model is utilized during the testing phase in order to increase the robust test accuracy. The results from our extensive experiments demonstrate that the proposed approach increases the robustness of the DNN model against various adversarial attacks, specifically, the fast gradient sign attack, the Carlini and Wagner (C&W) attack, the Projected Gradient Descent (PGD) attack, and the DeepFool attack. To be precise, the robust classifier obtained by our proposed approach can maintain a performance accuracy of 99% on average on the standard test set. Moreover, we empirically evaluate the runtime of two of the most effective adversarial attacks, i.e., the C&W attack and the BIM attack, and find that the C&W attack can utilize the GPU for faster adversarial example generation than the BIM attack can. For this reason, we further develop a parallel implementation of the proposed approach. This parallel implementation makes the proposed approach scalable for large datasets and complex models.
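A minimal sketch of the retraining idea in item 2 is shown below: each batch is augmented with Gaussian-noise copies and adversarially perturbed copies (FGSM is used here purely as a stand-in adversarial generator), and the model trains on the combined batch. The model, data loader, and hyperparameters are placeholders rather than the authors' configuration.

```python
# Hedged sketch of one epoch of adversarial retraining with Gaussian-noise
# augmentation. Not the authors' implementation; hyperparameters are assumed.
import torch
import torch.nn.functional as F

def adversarial_retrain_epoch(model, loader, optimizer, sigma=0.1, epsilon=0.03):
    model.train()
    for x, y in loader:
        # Gaussian-noise augmentation.
        x_noise = (x + sigma * torch.randn_like(x)).clamp(0, 1)
        # FGSM adversarial augmentation (stand-in for the paper's generators).
        x_adv = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()
        # Train on clean + noisy + adversarial samples together.
        batch = torch.cat([x, x_noise, x_adv])
        labels = torch.cat([y, y, y])
        optimizer.zero_grad()
        F.cross_entropy(model(batch), labels).backward()
        optimizer.step()
```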

     
  3. Neural models enjoy widespread use across a variety of tasks and have grown to become crucial components of many industrial systems. Despite their effectiveness and extensive popularity, they are not without their exploitable flaws. Initially applied to computer vision systems, the generation of adversarial examples is a process in which seemingly imperceptible perturbations are made to an image, with the purpose of inducing a deep learning based classifier to misclassify the image. Due to recent trends in speech processing, this has become a noticeable issue in speech recognition models. In late 2017, an attack was shown to be quite effective against the Speech Commands classification model. Limited-vocabulary speech classifiers, such as the Speech Commands model, are used quite frequently in a variety of applications, particularly in managing automated attendants in telephony contexts. As such, adversarial examples produced by this attack could have real-world consequences. While previous work in defending against these adversarial examples has investigated using audio preprocessing to reduce or distort adversarial noise, this work explores the idea of flooding particular frequency bands of an audio signal with random noise in order to detect adversarial examples. This technique of flooding, which does not require retraining or modifying the model, is inspired by work done in computer vision and builds on the idea that speech classifiers are relatively robust to natural noise. A combined defense incorporating 5 different frequency bands for flooding the signal with noise outperformed other existing defenses in the audio space, detecting adversarial examples with 91.8% precision and 93.5% recall.
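The band-flooding detector in item 3 can be sketched roughly as follows: inject random noise into selected frequency bands and flag the input as adversarial if the classifier's decision changes. The band edges, noise level, and `classify` function below are illustrative assumptions, not the defense's actual parameters.

```python
# Hedged sketch of frequency-band "flooding" for adversarial-example detection.
import numpy as np

def flood_band(audio, sample_rate, low_hz, high_hz, noise_level=0.05):
    """Add random noise to the [low_hz, high_hz] band of a mono signal."""
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    scale = noise_level * np.abs(spectrum).max()
    spectrum[band] += scale * (np.random.randn(band.sum())
                               + 1j * np.random.randn(band.sum()))
    return np.fft.irfft(spectrum, n=len(audio))

def is_adversarial(audio, sample_rate, classify, bands=((0, 1000), (1000, 4000))):
    """Flag the input if flooding any band changes the predicted label."""
    original = classify(audio)
    return any(classify(flood_band(audio, sample_rate, lo, hi)) != original
               for lo, hi in bands)
```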
  4. Research in the emerging field of adversarial ML has revealed that machine learning, especially deep learning, is highly vulnerable to imperceptible adversarial perturbations, both in the domain of vision as well as speech. This has induced an urgent need to devise fast and practical approaches to secure deep learning models from adversarial attacks, so that they can be safely deployed in real-world applications. In this showcase, we put forth the idea of compression as a viable solution to defend against adversarial attacks across modalities. Since most of these attacks depend on the gradient of the model to craft an adversarial instance, compression, which is usually non-differentiable, denies a useful gradient to the attacker. In the vision domain we have JPEG compression, and in the audio domain we have MP3 compression and AMR encoding -- all widely adopted techniques that have very fast implementations on most platforms and can be feasibly leveraged as defenses. We will show the effectiveness of these techniques against adversarial attacks through live demonstrations, both for vision as well as speech. These demonstrations will include real-time computation of adversarial perturbations for images and audio, as well as interactive application of compression for defense. We invite and encourage the audience to experiment with their own images and audio samples during the demonstrations. This work was undertaken jointly by researchers from Georgia Institute of Technology and Intel Corporation.
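A hedged sketch of the compression defense in item 4, for the vision case: round-trip the input image through JPEG before classification so that the non-differentiable compression step denies the attacker a useful gradient. The quality setting is an assumption; Pillow is used purely for illustration.

```python
# Hedged sketch of JPEG compression as a preprocessing defense.
import io
import numpy as np
from PIL import Image

def jpeg_defense(image_array, quality=75):
    """Round-trip a uint8 HxWx3 image through JPEG before classification."""
    buffer = io.BytesIO()
    Image.fromarray(image_array).save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    return np.array(Image.open(buffer))
```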
  5. Online advertisers have been quite successful in circumventing traditional adblockers that rely on manually curated rules to detect ads. As a result, adblockers have started to use machine learning (ML) classifiers for more robust detection and blocking of ads. Among these, AdGraph, which leverages rich contextual information to classify ads, is arguably the state-of-the-art ML-based adblocker. In this paper, we present a4, a tool that intelligently crafts adversarial ads to evade AdGraph. Unlike traditional adversarial examples in the computer vision domain that can perturb any pixels (i.e., unconstrained), adversarial ads generated by a4 are actionable in the sense that they preserve the application semantics of the web page. Through a series of experiments we show that a4 can bypass AdGraph about 81% of the time, which surpasses the state-of-the-art attack by a significant margin of 145.5%, with an overhead of <20% and perturbations that are visually imperceptible in the rendered webpage. We envision that a4’s framework can be used to potentially launch adversarial attacks against other ML-based web applications.