Title: Rationalizing Graph Neural Networks with Data Augmentation
Graph rationales are representative subgraph structures that best explain and support graph neural network (GNN) predictions. Graph rationalization jointly identifies these subgraphs during GNN training, improving both interpretability and generalization. GNNs are widely used for node-level tasks such as paper classification and for graph-level tasks such as molecular property prediction. At both levels, however, little attention has been given to GNN rationalization, and the scarcity of training examples makes it difficult to identify the optimal graph rationales. In this work, we address the problem by proposing a unified data augmentation framework with two novel operations on environment subgraphs to rationalize GNN predictions. We define the environment subgraph as the subgraph that remains after the rationale is identified and separated. The framework performs rationale–environment separation efficiently in the representation space, for either a node's neighborhood graph or a graph's complete structure, to avoid the high complexity of explicit graph decoding and encoding. We conduct experiments on 17 datasets spanning node classification, graph classification, and graph regression. The results demonstrate that our framework is effective and efficient at rationalizing and enhancing GNNs across these levels of graph tasks.
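The abstract leaves the mechanics implicit, but the core move, separating rationale from environment in the representation space and augmenting by recombining a rationale with environments drawn from other examples, can be sketched in a few lines. The gating network and the swap-style augmentation below are illustrative assumptions, not the paper's actual operations:

    # Hypothetical sketch of representation-space rationale-environment
    # separation; the gate and the environment swap are assumptions,
    # not the paper's two augmentation operations.
    import torch
    import torch.nn as nn

    class RationaleSeparator(nn.Module):
        def __init__(self, dim: int):
            super().__init__()
            # Soft gate scoring how much of each embedding is "rationale".
            self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

        def forward(self, h: torch.Tensor):
            # h: (batch, dim) embeddings of neighborhood graphs or whole
            # graphs produced by any GNN encoder.
            m = self.gate(h)
            rationale = m * h          # label-relevant part
            environment = (1 - m) * h  # leftover environment part
            return rationale, environment

    def environment_swap(rationale: torch.Tensor, environment: torch.Tensor):
        # Recombine each rationale with an environment drawn from another
        # example in the batch: the prediction should depend on the
        # rationale alone, so the swapped input keeps its original label.
        perm = torch.randperm(environment.size(0))
        return rationale + environment[perm]

    # Usage: train the encoder, gate, and prediction head so outputs stay
    # invariant to the swapped environments.
    h = torch.randn(32, 64)
    separator = RationaleSeparator(64)
    r, e = separator(h)
    h_augmented = environment_swap(r, e)

Because the swap happens on embeddings rather than on explicit graphs, no graph decoding or re-encoding is required, which is where the efficiency claim comes from.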
Award ID(s): 2102592, 2332270, 2146761
PAR ID: 10519861
Publisher / Repository: ACM
Date Published:
Journal Name: ACM Transactions on Knowledge Discovery from Data
Volume: 18
Issue: 4
ISSN: 1556-4681
Page Range / eLocation ID: 1 to 23
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Deep learning methods for graphs achieve remarkable performance on many node-level and graph-level prediction tasks. However, despite the proliferation of these methods and their success, prevailing Graph Neural Networks (GNNs) neglect subgraphs, rendering subgraph prediction tasks challenging to tackle in many impactful applications. Further, subgraph prediction tasks present several unique challenges: subgraphs can have non-trivial internal topology, but they also carry a notion of position and external connectivity relative to the underlying graph in which they exist. Here, we introduce SubGNN, a subgraph neural network for learning disentangled subgraph representations. We propose a novel subgraph routing mechanism that propagates neural messages between a subgraph's components and randomly sampled anchor patches from the underlying graph, yielding highly accurate subgraph representations. SubGNN specifies three channels, each designed to capture a distinct aspect of subgraph topology, and we provide empirical evidence that the channels encode their intended properties. We design a series of new synthetic and real-world subgraph datasets. Empirical results for subgraph classification on eight datasets show that SubGNN achieves considerable performance gains over strong baseline methods, including node-level and graph-level GNNs, outperforming the strongest baseline by 19.8%. SubGNN performs exceptionally well on challenging biomedical datasets where subgraphs have complex topology and even comprise multiple disconnected components.
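    As a rough sketch of the routing idea, one step of message passing between subgraph components and randomly sampled anchor patches might look like the following; the uniform sampling and dot-product attention here are simplified placeholders for SubGNN's channel-specific mechanisms:

        # Simplified anchor-patch routing; sampling and aggregation are
        # placeholders, not SubGNN's actual channel designs.
        import torch

        def sample_anchor_patches(num_nodes: int, num_patches: int, patch_size: int):
            # Draw small random node sets from the underlying graph.
            return [torch.randperm(num_nodes)[:patch_size] for _ in range(num_patches)]

        def route_messages(comp_emb, node_emb, patches):
            # Each subgraph component attends over anchor-patch embeddings
            # and folds the attended message back into its own embedding.
            patch_embs = torch.stack([node_emb[p].mean(dim=0) for p in patches])
            attn = torch.softmax(comp_emb @ patch_embs.T, dim=-1)  # (components, patches)
            return comp_emb + attn @ patch_embs

        node_emb = torch.randn(100, 32)  # node embeddings of the base graph
        comp_emb = torch.randn(3, 32)    # one row per subgraph component
        updated = route_messages(comp_emb, node_emb, sample_anchor_patches(100, 8, 5))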
  2. Graph Neural Networks (GNNs) have been widely deployed in various real-world applications. However, most GNNs are black-box models that lack explanations. One strategy for explaining GNNs is counterfactual explanation, which aims to find minimum perturbations on input graphs that change the GNN predictions. Existing works on GNN counterfactual explanation primarily take a local-level perspective (i.e., generating counterfactuals for each individual graph), which suffers from information overload and offers little insight into broader cross-graph relationships. To address these issues, we propose GlobalGCE, a novel global-level graph counterfactual explanation method. GlobalGCE identifies a collection of subgraph mapping rules as counterfactual explanations for the target GNN. According to these rules, substituting certain significant subgraphs with their counterfactual subgraphs changes the GNN prediction to the desired class for most graphs (i.e., maximum coverage). Methodologically, GlobalGCE pairs a significant-subgraph generator with a counterfactual subgraph autoencoder, so that the subgraphs and the rules can be generated effectively. Extensive experiments demonstrate the superiority of GlobalGCE over existing baselines.
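    To make the rule-based view concrete, a small sketch of applying subgraph mapping rules and measuring their coverage follows; matcher, substitute, and the rule fields are hypothetical stand-ins, since GlobalGCE generates both subgraphs and rules with learned models rather than explicit graph matching:

        # Hypothetical rule application and coverage check; `matcher` finds
        # an occurrence of rule.source in a graph, `substitute` splices in
        # rule.target at that occurrence.
        def apply_rules(graph, rules, matcher, substitute):
            for rule in rules:
                occurrence = matcher(graph, rule.source)
                if occurrence is not None:
                    graph = substitute(graph, occurrence, rule.target)
            return graph

        def coverage(graphs, rules, gnn, desired_class, matcher, substitute):
            # Fraction of graphs whose prediction flips to the desired class
            # after rule application; GlobalGCE seeks rules maximizing this.
            flipped = sum(
                gnn(apply_rules(g, rules, matcher, substitute)) == desired_class
                for g in graphs
            )
            return flipped / len(graphs)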
  3. Graph Neural Networks (GNNs) are a popular machine learning framework for solving various graph processing applications. This framework exploits both the graph topology and the feature vectors of the nodes. One important application of GNNs is the semi-supervised node classification task. The accuracy of node classification using a GNN depends on (i) the number and (ii) the choice of the training nodes. In this article, we demonstrate that increasing the training nodes by selecting nodes from the same class that are spread out across non-contiguous subgraphs can significantly improve the accuracy. We accomplish this by presenting a novel input intervention technique that can be used in conjunction with different GNN classification methods to increase the number of non-contiguous training nodes and, thereby, improve the accuracy. We also present an output intervention technique that identifies misclassified nodes and relabels them with their potentially correct labels. We demonstrate on real-world networks that our proposed methods, both individually and collectively, significantly improve the accuracy in comparison to the baseline GNN algorithms. Both methods are agnostic to the underlying GNN: apart from the initial set of training nodes generated by the baseline GNN methods, our techniques need no extra knowledge about the classes of the nodes. Thus, our methods are modular and can be used as pre- and post-processing steps with many currently available GNN methods to improve their accuracy.
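    A minimal sketch of the two interventions, assuming a networkx graph and pseudo-labels produced by the baseline GNN; the component-based selection and the majority-vote relabeling below are simplified stand-ins for the paper's actual criteria:

        # Illustrative interventions; the selection and relabeling rules
        # are assumptions, not the paper's exact procedures.
        import networkx as nx

        def input_intervention(G, train_nodes, pseudo_labels, per_comp=3):
            # Add pseudo-labeled nodes from connected components that hold
            # no training node, so each class's training examples span
            # non-contiguous regions of the graph.
            train = set(train_nodes)
            extra = []
            for comp in nx.connected_components(G):
                if comp & train:
                    continue  # this region already supplies training nodes
                labeled = [v for v in sorted(comp) if v in pseudo_labels]
                extra.extend(labeled[:per_comp])
            return list(train) + extra

        def output_intervention(G, pred_labels):
            # Relabel nodes whose prediction disagrees with a strict
            # majority of their neighbors' predicted labels.
            relabeled = dict(pred_labels)
            for v in G.nodes:
                votes = [pred_labels[u] for u in G.neighbors(v)]
                if votes:
                    majority = max(set(votes), key=votes.count)
                    if majority != pred_labels[v] and votes.count(majority) > len(votes) / 2:
                        relabeled[v] = majority
            return relabeled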
  4. Graph Neural Networks (GNNs) have demonstrated great potential in a variety of graph-based applications, such as recommender systems, drug discovery, and object recognition. Nevertheless, resource-efficient GNN learning is a rarely explored topic despite its many benefits for edge computing and Internet of Things (IoT) applications. To improve this state of affairs, this work proposes efficient subgraph-level training via resource-aware graph partitioning (SUGAR). SUGAR first partitions the initial graph into a set of disjoint subgraphs and then performs local training at the subgraph level. We provide a theoretical analysis and conduct extensive experiments on five graph benchmarks to verify its efficacy in practice. Our results across five different hardware platforms demonstrate substantial runtime speedups and memory reductions for SUGAR on large-scale graphs. We believe SUGAR opens a new research direction towards developing GNN methods that are resource-efficient and hence suitable for IoT deployment.
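    The subgraph-level training loop itself is easy to sketch; the round-robin split below is a toy placeholder for SUGAR's resource-aware partitioner, and train_step is an assumed per-subgraph optimization callback:

        # Toy partition-then-train-locally loop; a real resource-aware
        # partitioner would balance edges and memory against each device.
        import networkx as nx

        def partition(G, num_parts):
            parts = [set() for _ in range(num_parts)]
            for i, v in enumerate(G.nodes):
                parts[i % num_parts].add(v)  # disjoint node sets
            return [G.subgraph(p).copy() for p in parts]

        def train_locally(subgraphs, train_step, epochs=10):
            # Peak memory is bounded by the largest subgraph rather than
            # the full graph, which is where the savings come from.
            for _ in range(epochs):
                for sg in subgraphs:
                    train_step(sg)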