
This content will become publicly available on December 14, 2023

Title: OOD Link Prediction Generalization Capabilities of Message-Passing GNNs in Larger Test Graphs
This work provides the first theoretical study on the ability of graph Message Passing Neural Networks (gMPNNs) -- such as Graph Neural Networks (GNNs) -- to perform inductive out-of-distribution (OOD) link prediction tasks, where deployment (test) graph sizes are larger than training graphs. We first prove non-asymptotic bounds showing that link predictors based on permutation-equivariant (structural) node embeddings obtained by gMPNNs can converge to a random guess as test graphs get larger. We then propose a theoretically-sound gMPNN that outputs structural pairwise (2-node) embeddings and prove non-asymptotic bounds showing that, as test graphs grow, these embeddings converge to embeddings of a continuous function that retains its ability to predict links OOD. Empirical results on random graphs show agreement with our theoretical results.
Award ID(s):
1943364 1918483
Journal Name:
Advances in neural information processing systems
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Learning representations of sets of nodes in a graph is crucial for applications ranging from node-role discovery to link prediction and molecule classification. Graph Neural Networks (GNNs) have achieved great success in graph representation learning. However, the expressive power of GNNs is limited by the 1-Weisfeiler-Lehman (WL) test, and thus GNNs generate identical representations for graph substructures that may in fact be very different. More powerful GNNs, proposed recently by mimicking higher-order WL tests, focus only on representing entire graphs, and they are computationally inefficient because they cannot utilize the sparsity of the underlying graph. Here we propose and mathematically analyze a general class of structure-related features, termed Distance Encoding (DE). DE assists GNNs in representing any set of nodes, while providing strictly more expressive power than the 1-WL test. DE captures the distance between the node set whose representation is to be learned and each node in the graph. To capture the distance, DE can apply various graph-distance measures such as shortest path distance or generalized PageRank scores. We propose two ways for GNNs to use DEs: (1) as extra node features, and (2) as controllers of message aggregation in GNNs. Both approaches can utilize the sparse structure of the underlying graph, which leads to computational efficiency and scalability. We also prove that DE can distinguish node sets embedded in almost all regular graphs where traditional GNNs always fail. We evaluate DE on three tasks over six real networks: structural role prediction, link prediction, and triangle prediction. Results show that our models outperform GNNs without DE by up to 15% in accuracy and AUROC. Furthermore, our models also significantly outperform other state-of-the-art methods especially designed for the above tasks.
  2. Recent work shows that the expressive power of Graph Neural Networks (GNNs) in distinguishing non-isomorphic graphs is exactly the same as that of the Weisfeiler-Lehman (WL) graph test. In particular, they show that the WL test can be simulated by GNNs. However, those simulations involve neural networks for the “combine” function of size polynomial or even exponential in the number of graph nodes n, as well as feature vectors of length linear in n. We present an improved simulation of the WL test on GNNs with exponentially lower complexity. In particular, the neural network implementing the combine function in each node has only polylog(n) parameters, and the feature vectors exchanged by the nodes of the GNN consist of only O(log n) bits. We also give logarithmic lower bounds for the feature vector length and the size of the neural networks, showing the (near)-optimality of our construction.
  3. We study two popular ways to sketch the shortest path distances of an input graph. The first is distance preservers, which are sparse subgraphs that agree with the distances of the original graph on a given set of demand pairs. Prior work on distance preservers has exploited only a simple structural property of shortest paths, called consistency, stating that one can break shortest path ties such that no two paths intersect, split apart, and then intersect again later. We prove that consistency alone is not enough to understand distance preservers, by showing both a lower bound on the power of consistency and a new general upper bound that polynomially surpasses it. Specifically, our new upper bound is that any $p$ demand pairs in an $n$-node undirected unweighted graph have a distance preserver on $O(n^{2/3} p^{2/3} + n p^{1/3})$ edges. We leave a conjecture that the right bound is $O(n^{2/3} p^{2/3} + n)$ or better. The second part of this paper leverages these distance preservers in a new construction of additive spanners, which are subgraphs that preserve all pairwise distances up to an additive error function. We give improved error bounds for spanners with relatively few edges; for example, we prove that all graphs have spanners on $O(n)$ edges with $+O(n^{3/7 + \varepsilon})$ error. Our construction can be viewed as an extension of the popular path-buying framework to clusters of larger radii.
  4. Chordal graphs are a widely studied graph class, with applications in several areas of computer science, including structural learning of Bayesian networks. Many problems that are hard on general graphs become solvable on chordal graphs. The random generation of instances of chordal graphs for testing these algorithms is often required. Nevertheless, there are only a few known algorithms that generate random chordal graphs, and, as far as we know, none of them generates chordal graphs uniformly at random (where each chordal graph appears with equal probability). In this paper we propose a Markov chain Monte Carlo (MCMC) method to sample connected chordal graphs uniformly at random. Additionally, we propose a Markov chain that generates connected chordal graphs with a bounded treewidth uniformly at random. Bounding the treewidth parameter (which bounds the largest clique) has direct implications on the running time of various algorithms on chordal graphs. We prove that each of the proposed Markov chains is ergodic and therefore converges to the uniform distribution. Finally, as initial evidence that the Markov chains have the potential to mix rapidly, we prove that the chain on graphs with bounded treewidth mixes rapidly for trees (chordal graphs with treewidth bound of one).
  5. Graph convolutional neural networks (GCNs) embed nodes in a graph into Euclidean space, which has been shown to incur a large distortion when embedding real-world graphs with scale-free or hierarchical structure. Hyperbolic geometry offers an exciting alternative, as it enables embeddings with much smaller distortion. However, extending GCNs to hyperbolic geometry presents several unique challenges because it is not clear how to define neural network operations, such as feature transformation and aggregation, in hyperbolic space. Furthermore, since input features are often Euclidean, it is unclear how to transform the features into hyperbolic embeddings with the right amount of curvature. Here we propose Hyperbolic Graph Convolutional Neural Network (HGCN), the first inductive hyperbolic GCN that leverages both the expressiveness of GCNs and hyperbolic geometry to learn inductive node representations for hierarchical and scale-free graphs. We derive GCN operations in the hyperboloid model of hyperbolic space and map Euclidean input features to embeddings in hyperbolic spaces with different trainable curvature at each layer. Experiments demonstrate that HGCN learns embeddings that preserve hierarchical structure and leads to improved performance when compared to Euclidean analogs, even with very low dimensional embeddings: compared to state-of-the-art GCNs, HGCN achieves an error reduction of up to 63.1% in ROC AUC for link prediction and of up to 47.5% in F1 score for node classification, also improving the state of the art on the PubMed dataset.