

Title: Exploring robust architectures for deep artificial neural networks
Abstract

The architectures of deep artificial neural networks (DANNs) are routinely studied to improve their predictive performance. However, the relationship between the architecture of a DANN and its robustness to noise and adversarial attacks is less explored, especially in computer vision applications. Here we investigate the relationship between the robustness of DANNs in a vision task and their underlying graph structures. First, we explored the design space of DANN architectures using graph-theoretic robustness measures, transformed the resulting graphs into DANN architectures, and trained them on various image classification tasks. Then we explored the relationship between the robustness of the trained DANNs against noise and adversarial attacks and their underlying architectures. We show that the robustness of DANNs can be quantified before training using graph structural properties such as topological entropy and Ollivier-Ricci curvature, with the greatest reliability for complex tasks and large DANNs. Our results may also apply to tasks other than computer vision, such as natural language processing and recommender systems.
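A minimal sketch of the pre-training idea above, assuming (this is not taken from the paper) that candidate architecture graphs are scored with simple structural statistics before any network is built or trained. The Watts-Strogatz generator and the degree-distribution entropy used as a proxy for topological entropy are illustrative assumptions; Ollivier-Ricci curvature is only noted, since it typically requires a dedicated package.

```python
# Illustrative only: score candidate architecture graphs with simple structural
# statistics before any training; the measures and the generator are assumptions.
import networkx as nx
import numpy as np

def degree_entropy(g: nx.Graph) -> float:
    """Shannon entropy of the degree distribution (a simple proxy for topological entropy)."""
    counts = np.bincount([d for _, d in g.degree()])
    p = counts[counts > 0] / g.number_of_nodes()
    return float(-(p * np.log2(p)).sum())

def structural_report(g: nx.Graph) -> dict:
    # Ollivier-Ricci curvature usually needs a dedicated package (e.g. GraphRicciCurvature),
    # so this dependency-free sketch reports only entropy and clustering.
    return {
        "nodes": g.number_of_nodes(),
        "edges": g.number_of_edges(),
        "degree_entropy": degree_entropy(g),
        "avg_clustering": nx.average_clustering(g),
    }

# Candidate graphs (here: small-world graphs) that could later be mapped to the
# layer-wise connectivity of a DANN and trained on an image classification task.
candidates = [nx.watts_strogatz_graph(n=64, k=8, p=p) for p in (0.0, 0.1, 0.5, 1.0)]
for g in candidates:
    print(structural_report(g))
```

In such a workflow, graphs whose pre-training statistics track downstream robustness would then be converted to network connectivity and trained on the classification tasks.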

Award ID(s): 2234836, 1903466
NSF-PAR ID: 10386072
Author(s) / Creator(s): ; ; ;
Publisher / Repository: Nature Publishing Group
Date Published:
Journal Name: Communications Engineering
Volume: 1
Issue: 1
ISSN: 2731-3395
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Synthetic aperture radar (SAR) image classification is a challenging problem due to the complex imaging mechanism as well as the random speckle noise, which affects radar image interpretation. Recently, convolutional neural networks (CNNs) have been shown to outperform previous state-of-the-art techniques in computer vision tasks owing to their ability to learn relevant features from the data. However, CNNs in particular and neural networks, in general, lack uncertainty quantification and can be easily deceived by adversarial attacks. This paper proposes Bayes-SAR Net, a Bayesian CNN that can perform robust SAR image classification while quantifying the uncertainty or confidence of the network in its decision. Bayes-SAR Net propagates the first two moments (mean and covariance) of the approximate posterior distribution of the network parameters given the data and obtains a predictive mean and covariance of the classification output. Experiments, using the benchmark datasets Flevoland and Oberpfaffenhofen, show superior performance and robustness to Gaussian noise and adversarial attacks, as compared to the SAR-Net homologue. Bayes-SAR Net achieves a test accuracy that is around 10% higher in the case of adversarial perturbation (levels > 0.05). 
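A minimal, hypothetical sketch of the moment propagation described in Bayes-SAR Net for a single fully connected layer: the mean and variance of a Gaussian input are pushed through a layer whose weights carry an independent Gaussian posterior. The diagonal-covariance simplification and the variable names are assumptions, not the paper's exact formulation.

```python
# Illustrative only: propagate the first two moments of a Gaussian input through
# one fully connected layer with an independent Gaussian posterior on its weights.
import numpy as np

def linear_moment_propagation(mu_x, var_x, mu_w, var_w, mu_b, var_b):
    """Return mean and (diagonal) variance of y = W x + b under independence assumptions."""
    mu_y = mu_w @ mu_x + mu_b
    # Var[sum_j W_ij x_j] with W_ij independent of x_j:
    # var_w * (var_x + mu_x^2) + mu_w^2 * var_x, summed over j, plus the bias variance.
    var_y = var_w @ (var_x + mu_x ** 2) + (mu_w ** 2) @ var_x + var_b
    return mu_y, var_y

# Toy usage: an 8-dimensional input mapped to 3 output units.
rng = np.random.default_rng(0)
mu_x, var_x = rng.normal(size=8), np.full(8, 0.01)
mu_w, var_w = 0.1 * rng.normal(size=(3, 8)), np.full((3, 8), 1e-3)
mu_b, var_b = np.zeros(3), np.full(3, 1e-3)
print(linear_moment_propagation(mu_x, var_x, mu_w, var_w, mu_b, var_b))
```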
  2. The unprecedented growth in mobile systems has transformed the way we approach everyday computing. Unfortunately, the emergence of a sophisticated type of malware known as ransomware poses a great threat to consumers of this technology. Traditional research on mobile malware detection has focused on approaches that rely on analyzing bytecode for uncovering malicious apps. However, cybercriminals can bypass such methods by embedding malware directly in native machine code, making traditional methods inadequate. Another challenge that detection solutions face is scalability. The sheer number of malware variants released every year makes it difficult for solutions to efficiently scale their coverage. To address these concerns, this work presents RansomShield, an energy-efficient solution that leverages CNNs to detect ransomware. We evaluate CNN architectures known to perform well on computer vision tasks and examine their suitability for ransomware detection. We show that systematically converting native instructions from Android apps into images using space-filling curve visualization techniques enables CNNs to reliably detect ransomware with high accuracy. We characterize the robustness of this approach across ARM and x86 architectures and demonstrate the effectiveness of this solution across heterogeneous platforms including smartphones and Chromebooks. We evaluate the suitability of different models for mobile systems by comparing their energy demands on different platforms. In addition, we present a CNN introspection framework that determines the important features needed for ransomware detection. Finally, we evaluate the robustness of this solution against adversarial machine learning (AML) attacks using a state-of-the-art Android malware dataset.
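A sketch of the space-filling-curve step described in the item above: raw bytes of a binary are laid out along a Hilbert curve so that byte locality is preserved in the 2-D image a CNN consumes. The curve order, image size, and the synthetic byte blob standing in for extracted native code are illustrative assumptions; RansomShield's exact preprocessing may differ.

```python
# Illustrative only: map a byte sequence onto an image via a Hilbert curve.
import numpy as np

def hilbert_d2xy(order: int, d: int):
    """Map index d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    x = y = 0
    t, s, n = d, 1, 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        t //= 4
        s *= 2
    return x, y

def bytes_to_hilbert_image(data: bytes, order: int = 8) -> np.ndarray:
    side = 1 << order                    # 256 x 256 image for order 8
    img = np.zeros((side, side), dtype=np.uint8)
    for d, byte in enumerate(data[: side * side]):
        x, y = hilbert_d2xy(order, d)
        img[y, x] = byte
    return img

# Toy usage on a synthetic byte blob standing in for native code extracted from an APK.
blob = np.random.default_rng(0).integers(0, 256, size=256 * 256, dtype=np.uint8).tobytes()
image = bytes_to_hilbert_image(blob)
```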
  3. Vision transformers (ViTs) have recently set off a new wave in neural architecture design thanks to their record-breaking performance in various vision tasks. In parallel, to fulfill the goal of deploying ViTs into real-world vision applications, their robustness against potential malicious attacks has gained increasing attention. In particular, recent works show that ViTs are more robust against adversarial attacks as compared with convolutional neural networks (CNNs), and conjecture that this is because ViTs focus more on capturing global interactions among different input/feature patches, leading to their improved robustness to local perturbations imposed by adversarial attacks. In this work, we ask an intriguing question: “Under what kinds of perturbations do ViTs become more vulnerable learners compared to CNNs?” Driven by this question, we first conduct a comprehensive experiment regarding the robustness of both ViTs and CNNs under various existing adversarial attacks to understand the underlying reason favoring their robustness. Based on the drawn insights, we then propose a dedicated attack framework, dubbed Patch-Fool, that fools the self-attention mechanism by attacking its basic component (i.e., a single patch) with a series of attention-aware optimization techniques. Interestingly, our Patch-Fool framework shows for the first time that ViTs are not necessarily more robust than CNNs against adversarial perturbations. In particular, we find that ViTs are more vulnerable learners than CNNs against our Patch-Fool attack, a finding that is consistent across extensive experiments. Observations from Sparse Patch-Fool and Mild Patch-Fool, two variants of Patch-Fool, further indicate that the perturbation density and strength on each patch appear to be the key factors influencing the robustness ranking between ViTs and CNNs. We expect our Patch-Fool framework to shed light on both future architecture designs and training schemes for robustifying ViTs towards their real-world deployment. Our codes are available at https://github.com/RICE-EIC/Patch-Fool.
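A patch-restricted PGD sketch loosely in the spirit of Patch-Fool: only one patch of the input may be perturbed, and that perturbation is optimized to maximize the classification loss. The attention-aware objective of the actual Patch-Fool attack is not reproduced here, and the model interface, patch size, and hyperparameters are assumptions.

```python
# Illustrative only: a sign-gradient attack confined to a single input patch; the
# attention-aware objective of the real Patch-Fool attack is not implemented here.
import torch
import torch.nn.functional as F

def single_patch_attack(model, x, y, patch_xy=(0, 0), patch=16, steps=40, step_size=8 / 255):
    """Optimize a perturbation on one patch of x (in [0, 1]) to maximize model's CE loss."""
    px, py = patch_xy
    mask = torch.zeros_like(x)
    mask[..., py:py + patch, px:px + patch] = 1.0      # only this patch may change
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model((x + delta * mask).clamp(0, 1)), y)
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()     # sign-gradient ascent on the patch
            delta.grad.zero_()
    return (x + delta.detach() * mask).clamp(0, 1)

# Usage (hypothetical): x_adv = single_patch_attack(vit_model, images, labels, patch_xy=(64, 64))
```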
  4. Convolutional neural networks (CNNs) have achieved state-of-the-art performance on various tasks in computer vision. However, recent studies demonstrate that these models are vulnerable to carefully crafted adversarial samples and suffer from a significant performance drop when predicting them. Many methods have been proposed to improve adversarial robustness (e.g., adversarial training and new loss functions to learn adversarially robust feature representations). Here we offer a unique insight into the predictive behavior of CNNs: they tend to misclassify adversarial samples into the most probable false classes. This inspires us to propose a new Probabilistically Compact (PC) loss with logit constraints, which can be used as a drop-in replacement for cross-entropy (CE) loss to improve CNNs' adversarial robustness. Specifically, the PC loss enlarges the probability gaps between the true class and the false classes, while the logit constraints prevent the gaps from being erased by a small perturbation. We extensively compare our method with the state-of-the-art using large-scale datasets under both white-box and black-box attacks to demonstrate its effectiveness. The source codes are available at https://github.com/xinli0928/PC-LC.
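An illustrative margin-style loss that enlarges the probability gap between the true class and every false class while bounding the logits, in the spirit of the PC loss described above. This is not necessarily the paper's exact formulation; the hinge form, margin, and clamping bound are assumptions (the authors' implementation is at the linked repository).

```python
# Illustrative only: a hinge-style margin on predicted probabilities with a crude
# logit bound; not necessarily the exact PC loss of the paper above.
import torch
import torch.nn.functional as F

def probability_gap_loss(logits, targets, margin=0.1, logit_bound=10.0):
    """Penalize false classes whose probability comes within `margin` of the true class."""
    logits = logits.clamp(-logit_bound, logit_bound)     # keep logits bounded ("logit constraint")
    probs = F.softmax(logits, dim=1)
    true_p = probs.gather(1, targets.unsqueeze(1))       # (N, 1) probability of the true class
    false_mask = torch.ones_like(probs).scatter(1, targets.unsqueeze(1), 0.0)
    gaps = F.relu(probs + margin - true_p) * false_mask  # hinge on every false class
    return gaps.sum(dim=1).mean()

# Usage (hypothetical): use in place of cross-entropy, or add it as a regularizer:
# loss = probability_gap_loss(model(x), y)
```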
  5. Neural networks (NNs) have been adopted by brain-computer interfaces (BCIs) to encode brain signals acquired using electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS). However, NN models have been found to be vulnerable to adversarial examples, i.e., corrupted samples with imperceptible noise. Once attacked, such models could compromise medical diagnosis and patients' quality of life. While early work focused on interference from external devices at the time of signal acquisition, recent research has shifted to attacks on collected signals, features, and learning models under various attack modes (e.g., white-, grey-, and black-box). However, existing work only considers single-modality attacks and ignores the topological relationships among different observations, e.g., samples with strong similarities. Different from previous approaches, we introduce graph neural networks (GNNs) for multimodal BCI-based classification and explore their performance and robustness against adversarial attacks. This study evaluates the robustness of NN models with and without graph knowledge on both single-modality and multimodal data.
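A minimal sketch of the graph idea in the last item, assuming (this is not the study's stated design) that samples with strongly similar concatenated EEG and fNIRS feature vectors are connected in a k-nearest-neighbour graph and then smoothed with one GCN-style propagation step. Feature dimensions, k, and the random weight matrix are illustrative assumptions.

```python
# Illustrative only: build a sample-similarity graph and apply one GCN-style step.
import numpy as np

def knn_similarity_graph(features: np.ndarray, k: int = 5) -> np.ndarray:
    """Symmetric 0/1 adjacency connecting each sample to its k nearest neighbours."""
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    nn_idx = np.argsort(dists, axis=1)[:, :k]
    adj = np.zeros_like(dists)
    rows = np.repeat(np.arange(len(features)), k)
    adj[rows, nn_idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)                        # symmetrize

def gcn_step(adj: np.ndarray, x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One propagation step: relu(D^{-1/2} (A + I) D^{-1/2} X W)."""
    a_hat = adj + np.eye(len(adj))
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ w, 0.0)

# Toy usage: 32 samples with concatenated EEG (64-d) + fNIRS (16-d) features.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 80))
adj = knn_similarity_graph(x, k=5)
h = gcn_step(adj, x, rng.normal(size=(80, 16)))
```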