Title: Visualizing and Analyzing the Topology of Neuron Activations in Deep Adversarial Training.
Deep models are known to be vulnerable to data adversarial attacks, and many adversarial training techniques have been developed to improve their adversarial robustness. While data adversaries attack model predictions by modifying the data, little is known about their impact on the neuron activations produced by the model, which play a crucial role in determining the model’s predictions and interpretability. In this work, we aim to develop a topological understanding of adversarial training to enhance its interpretability. We analyze the topological structure—in particular, mapper graphs—of neuron activations of data samples produced by deep adversarial training. Each node of a mapper graph represents a cluster of activations, and two nodes are connected by an edge if their corresponding clusters have a nonempty intersection. We provide an interactive visualization tool that demonstrates the utility of our topological framework in exploring the activation space. We found that stronger attacks make the data samples more indistinguishable in the neuron activation space, which leads to lower accuracy. Our tool also provides a natural way to identify vulnerable data samples that may be useful in improving model robustness.
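To make the mapper construction above concrete, here is a minimal sketch in Python that builds a mapper graph from a matrix of activation vectors, assuming NumPy and scikit-learn are available. The one-dimensional PCA lens, the interval cover, the DBSCAN clusterer, and the function name build_mapper_graph are illustrative assumptions, not the authors' implementation.

```python
# Minimal, illustrative mapper graph over neuron activations (not the paper's code).
# Assumes `activations` is an (n_samples, n_neurons) NumPy array.
import numpy as np
from itertools import combinations
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN


def build_mapper_graph(activations, n_intervals=10, overlap=0.3, eps=0.5):
    # Lens: project the high-dimensional activations to 1-D with PCA.
    lens = PCA(n_components=1).fit_transform(activations).ravel()

    # Cover the lens range with overlapping intervals.
    lo, hi = lens.min(), lens.max()
    length = (hi - lo) / n_intervals
    nodes = []  # each node is a set of sample indices (one activation cluster)
    for i in range(n_intervals):
        start = lo + i * length - overlap * length
        end = lo + (i + 1) * length + overlap * length
        idx = np.where((lens >= start) & (lens <= end))[0]
        if len(idx) == 0:
            continue
        # Cluster the activations that fall inside this cover element.
        labels = DBSCAN(eps=eps).fit_predict(activations[idx])
        for lab in set(labels) - {-1}:
            nodes.append(set(idx[labels == lab]))

    # Edge between two nodes iff their clusters share at least one sample.
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges


# Toy usage: activations of 1,000 samples from a 512-unit layer.
acts = np.random.rand(1000, 512)
nodes, edges = build_mapper_graph(acts)
print(len(nodes), "nodes,", len(edges), "edges")
```

Each returned node is a set of sample indices, and an edge is added exactly when two clusters share a sample, mirroring the node and edge definition given in the abstract.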
Award ID(s):
2134223 2205418
NSF-PAR ID:
10428624
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the Topology, Algebra, and Geometry in Machine Learning (TAGML) Workshop at ICML
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Models produced by machine learning, particularly deep neural networks, are state-of-the-art for many machine learning tasks and demonstrate very high prediction accuracy. Unfortunately, these models are also very brittle and vulnerable to specially crafted adversarial examples. Recent results have shown that the accuracy of these models can be reduced from close to one hundred percent to below 5% using adversarial examples. This brittleness of deep neural networks makes it challenging to deploy these learning models in security-critical areas where adversarial activity is expected and cannot be ignored. A number of methods have recently been proposed to craft more effective and generalizable attacks on neural networks, along with competing efforts to improve the robustness of these learning models. But the current approaches to make machine learning techniques more resilient fall short of their goal. Further, the succession of new adversarial attacks against proposed methods to increase neural network robustness raises doubts about the existence of a foolproof approach to robustifying machine learning models against all possible adversarial attacks. In this paper, we consider the problem of detecting adversarial examples. This would help identify when the learning models cannot be trusted without attempting to repair the models or make them robust to adversarial attacks. This goal of finding the limitations of the learning model presents a more tractable approach to protecting against adversarial attacks. Our approach is based on identifying a low-dimensional manifold in which the training samples lie, and then using the distance of a new observation from this manifold to identify whether this data point is adversarial or not. Our empirical study demonstrates that adversarial examples not only lie farther from the data manifold, but their distance from the manifold increases with the attack confidence. Thus, adversarial examples that are likely to result in incorrect predictions by the machine learning model are also easier to detect with our approach. This is a first step towards formulating a novel approach based on computational geometry that can identify the limiting boundaries of a machine learning model and detect adversarial attacks.
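A minimal sketch of the detection idea in item 1, assuming NumPy and scikit-learn: approximate the low-dimensional manifold of the training samples with PCA and flag a new observation whose distance from that manifold (its reconstruction error) exceeds a threshold calibrated on clean data. The PCA approximation, the 99th-percentile threshold, and the function names are illustrative assumptions rather than the paper's exact construction.

```python
# Hedged sketch: approximate the data manifold with PCA and flag points that
# lie far from it. The linear (PCA) manifold and the percentile threshold are
# illustrative stand-ins, not the paper's construction.
import numpy as np
from sklearn.decomposition import PCA


def fit_manifold(train_x, n_components=50):
    return PCA(n_components=n_components).fit(train_x)


def distance_from_manifold(pca, x):
    # Reconstruction error = distance of x from the linear manifold.
    recon = pca.inverse_transform(pca.transform(x))
    return np.linalg.norm(x - recon, axis=1)


# Calibrate a threshold on clean training data, then test new observations.
train_x = np.random.rand(5000, 784)          # e.g. flattened images
pca = fit_manifold(train_x)
threshold = np.percentile(distance_from_manifold(pca, train_x), 99)

new_x = np.random.rand(10, 784)
is_adversarial = distance_from_manifold(pca, new_x) > threshold
print(is_adversarial)
```

Since the abstract reports that distance from the manifold grows with attack confidence, high-confidence adversarial examples would cross such a threshold most easily.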
  2. Topological data analysis (TDA) is a branch of computational mathematics, bridging algebraic topology and data science, that provides compact, noise-robust representations of complex structures. Deep neural networks (DNNs) learn millions of parameters associated with a series of transformations defined by the model architecture, resulting in high-dimensional, difficult-to-interpret internal representations of input data. As DNNs become more ubiquitous across multiple sectors of our society, there is increasing recognition that mathematical methods are needed to aid analysts, researchers, and practitioners in understanding and interpreting how these models' internal representations relate to the final classification. In this paper, we apply cutting-edge techniques from TDA with the goal of gaining insight into the interpretability of convolutional neural networks used for image classification. We use two common TDA approaches to explore several methods for modeling hidden-layer activations as high-dimensional point clouds, and provide experimental evidence that these point clouds capture valuable structural information about the model's classification process. First, we demonstrate that a distance metric based on persistent homology can be used to quantify meaningful differences between layers, and discuss these distances in the broader context of existing representational similarity metrics for neural network interpretability. Second, we show that a mapper graph can provide semantic insight into how these models organize hierarchical class knowledge at each layer. These observations demonstrate that TDA is a useful tool to help deep learning practitioners unlock the hidden structures of their models.
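A hedged sketch of the first TDA ingredient in item 2: treat each layer's activations as a high-dimensional point cloud, compute a persistence diagram per layer, and compare layers with a bottleneck distance. It assumes the ripser and persim packages are installed; the subsampling size and the use of degree-0 diagrams (with the infinite bar dropped) are illustrative choices, not the paper's exact metric.

```python
# Hedged sketch: compare two layers' activation point clouds with a
# persistence-based distance (assumes `ripser` and `persim` are installed).
import numpy as np
from ripser import ripser
from persim import bottleneck


def layer_diagram(activations, n_points=300, seed=0):
    # Subsample the point cloud (persistent homology is costly), compute its
    # degree-0 persistence diagram, and drop the single infinite bar.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(activations), size=min(n_points, len(activations)),
                     replace=False)
    dgm = ripser(activations[idx], maxdim=0)['dgms'][0]
    return dgm[np.isfinite(dgm[:, 1])]


# Activations of the same inputs taken at two different hidden layers.
layer_a = np.random.rand(1000, 256)
layer_b = np.random.rand(1000, 128)
print("bottleneck distance:", bottleneck(layer_diagram(layer_a),
                                          layer_diagram(layer_b)))
```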
  3. Deep Neural Networks (DNNs) are vulnerable to adversarial perturbations: small changes deliberately crafted on the input to mislead the model into wrong predictions. Adversarial attacks have disastrous consequences for deep-learning-empowered critical applications. Existing defense and detection techniques both require extensive knowledge of the model, the testing inputs, and even execution details. They are not viable for general deep learning implementations where the model internals are unknown, a common ‘black-box’ scenario for model users. Inspired by the fact that electromagnetic (EM) emanations of a model inference depend on both operations and data and may contain footprints of different input classes, we propose a framework, EMShepherd, to capture EM traces of model execution, process the traces, and exploit them for adversarial detection. Only benign samples and their EM traces are used to train the adversarial detector: a set of EM classifiers and class-specific unsupervised anomaly detectors. When the victim model system is under attack by an adversarial example, the model execution will differ from executions for the known classes, and so will the EM trace. We demonstrate that our air-gapped EMShepherd can effectively detect different adversarial attacks on a commonly used FPGA deep learning accelerator on both the Fashion MNIST and CIFAR-10 datasets. It achieves a detection rate on most types of adversarial samples comparable to that of state-of-the-art ‘white-box’ software-based detectors.
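A rough, illustrative sketch of the detection idea in item 3 (not the EMShepherd implementation): fit one unsupervised anomaly detector per class on features extracted from benign EM traces, then flag an input whose trace does not look like benign traces of the class the victim model predicted. The use of scikit-learn's IsolationForest, the feature dimensionality, and all parameters are assumptions.

```python
# Hedged sketch: one unsupervised anomaly detector per class, fit only on
# benign EM-trace features. Feature extraction and parameters are illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest


def fit_per_class_detectors(benign_features, benign_labels, n_classes):
    detectors = {}
    for c in range(n_classes):
        detectors[c] = IsolationForest(random_state=0).fit(
            benign_features[benign_labels == c])
    return detectors


def is_adversarial(detectors, trace_features, predicted_class):
    # Flag the input if its EM trace does not look like benign traces
    # of the class the victim model predicted.
    score = detectors[predicted_class].predict(trace_features.reshape(1, -1))
    return score[0] == -1  # IsolationForest marks outliers with -1


# Toy usage with random stand-ins for EM-trace features.
feats = np.random.rand(2000, 64)
labels = np.random.randint(0, 10, size=2000)
dets = fit_per_class_detectors(feats, labels, n_classes=10)
print(is_adversarial(dets, np.random.rand(64), predicted_class=3))
```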
  4. Abstract

    Deep neural networks such as GoogLeNet, ResNet, and BERT have achieved impressive performance in tasks such as image and text classification. To understand how such performance is achieved, we probe a trained deep neural network by studying neuron activations, i.e., combinations of neuron firings, at various layers of the network in response to a particular input. With a large number of inputs, we aim to obtain a global view of what neurons detect by studying their activations. In particular, we develop visualizations that show the shape of the activation space, the organizational principle behind neuron activations, and the relationships of these activations within a layer. Applying tools from topological data analysis, we present TopoAct, a visual exploration system to study topological summaries of activation vectors. We present exploration scenarios using TopoAct that provide valuable insights into learned representations of neural networks. We expect TopoAct to give a topological perspective that enriches the current toolbox of neural network analysis, and to provide a basis for network architecture diagnosis and data anomaly detection.

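TopoAct itself is a purpose-built visual exploration system; as a hedged approximation of the kind of topological summary it studies, the sketch below builds a mapper graph of activation vectors with the kepler-mapper package, assuming it is installed. The PCA lens, cover parameters, and DBSCAN clusterer are illustrative assumptions.

```python
# Rough, illustrative mapper summary of activation vectors using the
# kepler-mapper package (not the TopoAct system itself).
import numpy as np
import kmapper as km
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

activations = np.random.rand(2000, 512)   # stand-in for a layer's activations

mapper = km.KeplerMapper(verbose=0)
lens = mapper.fit_transform(activations, projection=PCA(n_components=2))
graph = mapper.map(lens, activations,
                   cover=km.Cover(n_cubes=10, perc_overlap=0.3),
                   clusterer=DBSCAN(eps=0.5, min_samples=3))
mapper.visualize(graph, path_html="activation_mapper.html",
                 title="Mapper summary of layer activations")
```

The resulting HTML file gives an interactive view of the mapper nodes and their overlaps, loosely analogous to the topological summaries described above.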
  5.
    Deep Neural Networks (DNNs) trained by gradient descent are known to be vulnerable to maliciously perturbed adversarial inputs, a.k.a. adversarial attacks. As a countermeasure against adversarial attacks, increasing model capacity has been discussed and reported by many recent works as an effective approach for enhancing DNN robustness. In this work, we show that shrinking the model size through proper weight pruning can also help improve DNN robustness under adversarial attack. To obtain a simultaneously robust and compact DNN model, we propose a multi-objective training method called Robust Sparse Regularization (RSR), which fuses several regularization techniques, including channel-wise noise injection, a lasso weight penalty, and adversarial training. We conduct extensive experiments to show the effectiveness of RSR against popular white-box (i.e., PGD and FGSM) and black-box attacks. Thanks to RSR, 85% of the weight connections of ResNet-18 can be pruned while still achieving 0.68% and 8.72% improvements in clean- and perturbed-data accuracy, respectively, on the CIFAR-10 dataset, compared to its PGD adversarial training baseline.
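A hedged PyTorch sketch of one training step combining two of the RSR ingredients named in item 5: adversarial training (here PGD) and a lasso (L1) weight penalty. Channel-wise noise injection and the pruning schedule are omitted, and the attack strength, step sizes, and l1_coef are illustrative values rather than the paper's settings.

```python
# Hedged sketch of an RSR-style training step: PGD adversarial training plus
# a lasso (L1) weight penalty. Hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    # Start from a random point in the eps-ball and take signed-gradient steps.
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()
        # Project back into the eps-ball around x and into the valid image range.
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv.detach()


def rsr_style_step(model, optimizer, x, y, l1_coef=1e-5):
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    # Adversarial training loss plus a lasso penalty on all weights.
    loss = F.cross_entropy(model(x_adv), y)
    loss = loss + l1_coef * sum(p.abs().sum() for p in model.parameters())
    loss.backward()
    optimizer.step()
    return loss.item()
```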