Graph neural networks (GNNs) are popular machine learning models for graphs with many applications across scientic domains. However, GNNs are considered black box models, and it is challenging to understand how the model makes predictions. Game theoric Shapley value approaches are popular explanation methods in other domains but are not wellstudied for graphs. Some studies have proposed Shapley value based GNN explanations, yet they have several limitations: they consider limited samples to approximate Shapley values; some mainly focus on small and large coalition sizes, and they are an order of magnitude slower than other explanation methods, making them inapplicable to even moderatesize graphs. In this work, we propose GNNShap, which provides explanations for edges since they provide more natural explanations for graphs and more negrained explanations. We overcome the limitations by sampling from all coalition sizes, parallelizing the sampling on GPUs, and speeding up model predictions by batching. GNNShap gives better delity scores and faster explanations than baselines on realworld datasets. The code is available at https://github.com/HipGraph/GNNShap.
more »
« less
GStarX: Explaining Graph Neural Networks with StructureAware Cooperative Games
Explaining machine learning models is an important and increasingly popular area of research interest. The Shapley value from game theory has been proposed as a prime approach to compute feature importance towards model predictions on images, text, tabular data, and recently graph neural networks (GNNs) on graphs. In this work, we revisit the appropriateness of the Shapley value for GNN explanation, where the task is to identify the most important subgraph and constituent nodes for GNN predictions. We claim that the Shapley value is a nonideal choice for graph data because it is by definition not structureaware. We propose a Graph Structureaware eXplanation (GStarX) method to leverage the critical graph structure information to improve the explanation. Specifically, we define a scoring function based on a new structureaware value from the cooperative game theory proposed by Hamiache and Navarro (HN). When used to score node importance, the HN value utilizes graph structures to attribute cooperation surplus between neighbor nodes, resembling message passing in GNNs, so that node importance scores reflect not only the node feature importance, but also the node structural roles. We demonstrate that GStarX produces qualitatively more intuitive explanations, and quantitatively improves explanation fidelity over strong baselines on chemical graph property prediction and text graph sentiment classification.
more »
« less
 Award ID(s):
 1937599
 NSFPAR ID:
 10464925
 Date Published:
 Journal Name:
 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Graph Neural Networks (GNNs) are a powerful tool for machine learning on graphs. GNNs combine node feature information with the graph structure by recursively passing neural messages along edges of the input graph. However, incorporating both graph structure and feature information leads to complex models and explaining predictions made by GNNs remains unsolved. Here we propose GNNEXPLAINER, the first general, modelagnostic approach for providing interpretable explanations for predictions of any GNNbased model on any graphbased machine learning task. Given an instance, GNNEXPLAINER identifies a compact subgraph structure and a small subset of node features that have a crucial role in GNN’s prediction. Further, GNNEXPLAINER can generate consistent and concise explanations for an entire class of instances. We formulate GNNEXPLAINER as an optimization task that maximizes the mutual information between a GNN’s prediction and distribution of possible subgraph structures. Experiments on synthetic and realworld graphs show that our approach can identify important graph structures as well as node features, and outperforms alternative baseline approaches by up to 43.0% in explanation accuracy. GNNEXPLAINER provides a variety of benefits, from the ability to visualize semantically relevant structures to interpretability, to giving insights into errors of faulty GNNs.more » « less

null (Ed.)Message passing Graph Neural Networks (GNNs) provide a powerful modeling framework for relational data. However, the expressive power of existing GNNs is upperbounded by the 1WeisfeilerLehman (1WL) graph isomorphism test, which means GNNs that are not able to predict node clustering coefficients and shortest path distances, and cannot differentiate between different d regular graphs. Here we develop a class of message passing GNNs, named Identityaware Graph Neural Networks (IDGNNs), with greater expressive power than the 1WL test. IDGNN offers a minimal but powerful solution to limitations of existing GNNs. IDGNN extends existing GNN architectures by inductively considering nodes’ identities during message passing. To embed a given node, IDGNN first extracts the ego network centered at the node, then conducts rounds of heterogeneous message passing, where different sets of parameters are applied to the center node than to other surrounding nodes in the ego network. We further propose a simplified but faster version of IDGNN that injects node identity information as augmented node features. Altogether, both versions of ID GNN represent general extensions of message passing GNNs, where experiments show that transforming existing GNNs to IDGNNs yields on average 40% accuracy improvement on challenging node, edge, and graph property prediction tasks; 3% accuracy improvement on node and graph classification benchmarks; and 15% ROC AUC improvement on realworld link prediction tasks. Additionally, IDGNNs demonstrate improved or comparable performance over other taskspecific graph networks.more » « less

null (Ed.)Message passing Graph Neural Networks (GNNs) provide a powerful modeling framework for relational data. However, the expressive power of existing GNNs is upperbounded by the 1WeisfeilerLehman (1WL) graph isomorphism test, which means GNNs that are not able to predict node clustering coefficients and shortest path distances, and cannot differentiate between different dregular graphs. Here we develop a class of message passing GNNs, named Identityaware Graph Neural Networks (IDGNNs), with greater expressive power than the 1WL test. IDGNN offers a minimal but powerful solution to limitations of existing GNNs. IDGNN extends existing GNN architectures by inductively considering nodes’ identities during message passing. To embed a given node, IDGNN first extracts the ego network centered at the node, then conducts rounds of heterogeneous message passing, where different sets of parameters are applied to the center node than to other surrounding nodes in the ego network. We further propose a simplified but faster version of IDGNN that injects node identity information as augmented node features. Altogether, both versions of IDGNN represent general extensions of message passing GNNs, where experiments show that transforming existing GNNs to IDGNNs yields on average 40% accuracy improvement on challenging node, edge, and graph property prediction tasks; 3% accuracy improvement on node and graph classification benchmarks; and 15% ROC AUC improvement on realworld link prediction tasks. Additionally, IDGNNs demonstrate improved or comparable performance over other taskspecific graph networks.more » « less

Graph Neural Networks (GNNs) have shown satisfying performance in various graph analytical problems. Hence, they have become the de facto solution in a variety of decisionmaking scenarios. However, GNNs could yield biased results against certain demographic subgroups. Some recent works have empirically shown that the biased structure of the input network is a significant source of bias for GNNs. Nevertheless, no studies have systematically scrutinized which part of the input network structure leads to biased predictions for any given node. The low transparency on how the structure of the input network influences the bias in GNN outcome largely limits the safe adoption of GNNs in various decisioncritical scenarios. In this paper, we study a novel research problem of structural explanation of bias in GNNs. Specifically, we propose a novel posthoc explanation framework to identify two edge sets that can maximally account for the exhibited bias and maximally contribute to the fairness level of the GNN prediction for any given node, respectively. Such explanations not only provide a comprehensive understanding of bias/fairness of GNN predictions but also have practical significance in building an effective yet fair GNN model. Extensive experiments on realworld datasets validate the effectiveness of the proposed framework towards delivering effective structural explanations for the bias of GNNs. Opensource code can be found at https://github.com/yushundong/REFEREE.more » « less