Influence-Directed Explanations for Deep Convolutional Networks

Leino, Klas; Sen, Shayak; Datta, Anupam; Fredrikson, Matt; Li, Linyi

doi:10.1109/TEST.2018.8624792

We study the problem of explaining a rich class of behavioral properties of deep neural networks. Distinctively, our influence-directed explanations approach this problem by peering inside the network to identify neurons with high influence on a quantity and distribution of interest, using an axiomatically-justified influence measure, and then providing an interpretation for the concepts these neurons represent. We evaluate our approach by demonstrating a number of its unique capabilities on convolutional neural networks trained on ImageNet. Our evaluation demonstrates that influence-directed explanations (1) identify influential concepts that generalize across instances, (2) can be used to extract the “essence” of what the network learned about a class, and (3) isolate individual features the network uses to make decisions and distinguish related classes.
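The abstract does not state the influence measure itself, so the following is only a minimal sketch under an assumed definition: the influence of an internal neuron is approximated as the gradient of a scalar quantity of interest with respect to that neuron's activation, averaged over samples drawn from a distribution of interest. The toy two-layer network, the weights, and the names `hidden`, `quantity`, and `neuron_influence` are all hypothetical and are not taken from the paper.

```python
import math

def hidden(x, W1):
    """Internal layer h(x): ReLU of a linear map (toy stand-in for a conv layer)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]

def quantity(z, w2):
    """Assumed quantity of interest g(z): a smooth scalar score of the hidden layer."""
    return math.tanh(sum(w * zi for w, zi in zip(w2, z)))

def neuron_influence(samples, W1, w2, eps=1e-5):
    """Average central-difference gradient of g w.r.t. each hidden neuron.

    `samples` plays the role of the distribution of interest.
    """
    n = len(W1)
    totals = [0.0] * n
    for x in samples:
        z = hidden(x, W1)
        for j in range(n):
            up = z[:]
            up[j] += eps
            down = z[:]
            down[j] -= eps
            totals[j] += (quantity(up, w2) - quantity(down, w2)) / (2 * eps)
    return [t / len(samples) for t in totals]

# Hypothetical weights and inputs, chosen only for illustration.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w2 = [0.5, -0.25, 0.0]
samples = [[0.2, 0.1], [0.4, 0.3], [0.1, 0.5]]

influence = neuron_influence(samples, W1, w2)
# Rank neurons from most to least influential by absolute influence.
ranked = sorted(range(len(influence)), key=lambda j: abs(influence[j]), reverse=True)
```

In this toy setting the third hidden neuron has zero weight into the score, so its influence is zero and it ranks last. In a real convolutional network the gradient would typically come from automatic differentiation rather than finite differences, and the highest-ranked neurons would then be visualized to interpret the concepts they represent.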

Award ID(s): 1704845

PAR ID: 10095680

Author(s) / Creator(s): Leino, Klas; Sen, Shayak; Datta, Anupam; Fredrikson, Matt; Li, Linyi

Date Published: 2018-10-01

Journal Name: IEEE International Test Conference (ITC)

Page Range / eLocation ID: 1 to 8

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/TEST.2018.8624792
