Title: Semantic Network Interpretation
Network interpretation, as an effort to reveal the features learned by a network, remains largely visualization-based. In this paper, our goal is to tackle semantic network interpretation at both the filter and decision levels. For filter-level interpretation, we represent the concepts a filter encodes with a probability distribution of visual attributes. The decision-level interpretation is achieved by textual summarization that generates an explanatory sentence containing clues behind a network’s decision. A Bayesian inference algorithm is proposed to automatically associate filters and network decisions with visual attributes. A human study confirms that the semantic interpretation is a beneficial alternative or complement to visualization methods. We demonstrate the crucial role that semantic network interpretation can play in understanding a network’s failure patterns. More importantly, semantic network interpretation enables a better understanding of the correlation between a model’s performance and its distribution metrics like filter selectivity and concept sparseness.
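The abstract does not spell out the Bayesian inference algorithm, so the following is only a minimal sketch of the general idea of associating one filter with visual attributes through Bayes' rule. It is not the paper's method. The function name `attribute_posterior`, the firing `threshold`, the smoothing constant `alpha`, and the toy data are all assumptions made for illustration.

```python
def attribute_posterior(activations, image_attributes, attributes,
                        threshold=0.5, alpha=1.0):
    """Score each attribute for one filter as P(fires | attr) * P(attr),
    then normalize the scores so they sum to one.

    activations      -- one scalar filter response per image (hypothetical input)
    image_attributes -- one set of attribute labels per image
    attributes       -- the attribute vocabulary
    threshold, alpha -- assumed firing threshold and Laplace smoothing constant
    """
    n = len(activations)
    fired = [a > threshold for a in activations]
    scores = {}
    for attr in attributes:
        has_attr = [attr in labels for labels in image_attributes]
        n_attr = sum(has_attr)
        # Smoothed prior: how common the attribute is in the dataset.
        prior = (n_attr + alpha) / (n + alpha * len(attributes))
        # Smoothed likelihood: how often the filter fires when the attribute is present.
        n_both = sum(f and h for f, h in zip(fired, has_attr))
        likelihood = (n_both + alpha) / (n_attr + 2 * alpha)
        scores[attr] = likelihood * prior
    total = sum(scores.values())
    return {attr: s / total for attr, s in scores.items()}


# Toy data (invented): the filter responds strongly on the two striped images.
activations = [0.9, 0.8, 0.1, 0.2]
image_attributes = [{"striped"}, {"striped", "furry"}, {"furry"}, {"spotted"}]
dist = attribute_posterior(activations, image_attributes,
                           ["striped", "furry", "spotted"])
```

On this toy input the normalized scores rank "striped" first, then "furry", then "spotted", which is the kind of per-filter attribute distribution the abstract describes. Because images can carry several attributes at once, the normalization here is a convenience for producing a distribution, not a claim about how the paper handles overlapping attributes.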
Award ID(s):
1651832
PAR ID:
10416682
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)
Page Range / eLocation ID:
400 to 409
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In the field of visualization, understanding users’ analytical reasoning is important for evaluating the effectiveness of visualization applications. Several studies have been conducted to capture and analyze user interactions to comprehend this reasoning process. However, few have successfully linked these interactions to users’ reasoning processes. This paper introduces an approach that addresses the limitation by correlating semantic user interactions with analysis decisions using an interactive wire transaction analysis system and a visual state transition matrix, both designed as visual analytics applications. The system enables interactive analysis for evaluating financial fraud in wire transactions. It also allows mapping captured user interactions and analytical decisions back onto the visualization to reveal their decision differences. The visual state transition matrix further aids in understanding users’ analytical flows, revealing their decision-making processes. Classification machine learning algorithms are applied to evaluate the effectiveness of our approach in understanding users’ analytical reasoning process by connecting the captured semantic user interactions to their decisions (i.e., suspicious, not suspicious, and inconclusive) on wire transactions. With the algorithms, an average of 72% accuracy is determined to classify the semantic user interactions. For classifying individual decisions, the average accuracy is 70%. Notably, the accuracy for classifying ‘inconclusive’ decisions is 83%. Overall, the proposed approach improves the understanding of users’ analytical decisions and provides a robust method for evaluating user interactions in visualization tools. 
  2. Human scene categorization is rapid and robust, but we have little understanding of how individual features contribute to categorization, nor the time scale of their contribution. This issue is compounded by the non-independence of the many candidate features. Here, we used singular value decomposition to orthogonalize 11 different scene descriptors that included both visual and semantic features. Using high-density EEG and regression analyses, we observed that most explained variability was carried by a late layer of a deep convolutional neural network, as well as a model of a scene’s functions given by the American Time Use Survey. Furthermore, features that explained more variance also tended to explain earlier variance. These results extend previous large-scale behavioral results showing the importance of functional features for scene categorization. Furthermore, these results fail to support models of visual perception that are encapsulated from higher-level cognitive attributes.
  3. Deep neural networks have been shown to be fooled rather easily using adversarial attack algorithms. Practical methods such as adversarial patches have been shown to be extremely effective in causing misclassification. However, these patches are highlighted using standard network interpretation algorithms, thus revealing the identity of the adversary. We show that it is possible to create adversarial patches which not only fool the prediction, but also change what we interpret regarding the cause of the prediction. Moreover, we introduce our attack as a controlled setting to measure the accuracy of interpretation algorithms. We show this using extensive experiments for Grad-CAM interpretation that transfers to occluding patch interpretation as well. We believe our algorithms can facilitate developing more robust network interpretation tools that truly explain the network’s underlying decision-making process.
  4. We study the underexplored but fundamental vision problem of machine understanding of abstract freehand scene sketches. We introduce a sketch encoder that results in a semantically-aware feature space, which we evaluate by testing its performance on a semantic sketch segmentation task. To train our model we rely only on the availability of bitmap sketches with their brief captions and do not require any pixel-level annotations. To obtain generalization to a large set of sketches and categories, we build on a vision transformer encoder pretrained with the CLIP model. We freeze the text encoder and perform visual-prompt tuning of the visual encoder branch while introducing a set of critical modifications. Firstly, we augment the classical key-query (k-q) self-attention blocks with value-value (v-v) self-attention blocks. Central to our model is a two-level hierarchical network design that enables efficient semantic disentanglement: The first level ensures holistic scene sketch encoding, and the second level focuses on individual categories. We, then, in the second level of the hierarchy, introduce a cross-attention between textual and visual branches. Our method outperforms zero-shot CLIP pixel accuracy of segmentation results by 37 points, reaching an accuracy of 85.5% on the FS-COCO sketch dataset. Finally, we conduct a user study that allows us to identify further improvements needed over our method to reconcile machine and human understanding of scene sketches.
  5. Generative adversarial networks (GAN) have witnessed tremendous growth in recent years, demonstrating wide applicability in many domains. However, GANs remain notoriously difficult for people to interpret, particularly for modern GANs capable of generating photo‐realistic imagery. In this work we contribute a visual analytics approach for GAN interpretability, where we focus on the analysis and visualization of GAN disentanglement. Disentanglement is concerned with the ability to control content produced by a GAN along a small number of distinct, yet semantic, factors of variation. The goal of our approach is to shed insight on GAN disentanglement, above and beyond coarse summaries, instead permitting a deeper analysis of the data distribution modeled by a GAN. Our visualization allows one to assess a single factor of variation in terms of groupings and trends in the data distribution, where our analysis seeks to relate the learned representation space of GANs with attribute‐based semantic scoring of images produced by GANs. Through use‐cases, we show that our visualization is effective in assessing disentanglement, allowing one to quickly recognize a factor of variation and its overall quality. In addition, we show how our approach can highlight potential dataset biases learned by GANs.