skip to main content

Title: BRIGHT: A Bridging Algorithm for Network Alignment
Multiple networks emerge in a wealth of high-impact applications. Network alignment, which aims to find the node correspondence across different networks, plays a fundamental role for many data mining tasks. Most of the existing methods can be divided into two categories: (1) consistency optimization based methods, which often explicitly assume the alignment to be consistent in terms of neighborhood topology and attribute across networks, and (2) network embedding based methods which learn low-dimensional node embedding vectors to infer alignment. In this paper, by analyzing certain methods of these two categories, we show that (1) the consistency optimization based methods are essentially specific random walk propagations from anchor links that might be restrictive; (2) the embedding based methods no longer explicitly assume alignment consistency but inevitably suffer from the space disparity issue. To overcome these two limitations, we bridge these methods and propose a novel family of network alignment algorithms BRIGHT to handle both non-attributed and attributed networks. Specifically, it constructs a space by random walk with restart (RWR) whose bases are one-hot encoding vectors of anchor nodes, followed by a shared linear layer. Our experiments on real-world networks show that the proposed family of algorithms BRIGHT outperform the state-of-the- arts more » for both non-attributed and attributed network alignment tasks. « less
Award ID(s):
1939725 1947135
Publication Date:
Journal Name:
Sponsoring Org:
National Science Foundation
More Like this
  1. Finding node associations across different networks is the cornerstone behind a wealth of high-impact data mining applications. Traditional approaches are often, explicitly or implicitly, built upon the linearity and/or consistency assumptions. On the other hand, the recent network embedding based methods promise a natural way to handle the non-linearity, yet they could suffer from the disparate node embedding space of different networks. In this paper, we address these limitations and tackle cross-network node associations from a new angle, i.e., cross-network transformation. We ask a generic question: Given two different networks, how can we transform one network to another? We propose an end-to-end model that learns a composition of nonlinear operations so that one network can be transformed to another in a hierarchical manner. The proposed model bears three distinctive advantages. First (composite transformation), it goes beyond the linearity/consistency assumptions and performs the cross-network transformation through a composition of nonlinear computations. Second (representation power), it can learn the transformation of both network structures and node attributes at different resolutions while identifying the cross-network node associations. Third (generality), it can be applied to various tasks, including network alignment, recommendation, cross-layer dependency inference. Extensive experiments on different tasks validate and verify the effectivenessmore »of the proposed model.« less
  2. Network alignment (NA) is a fundamental problem in many application domains – from social networks, through biology and communications, to neuroscience. The main objective is to identify common nodes and most similar connections across multiple networks (resp. graphs). Many of the existing efforts focus on efficient anchor node linkage by leveraging various features and optimizing network mapping functions with the pairwise similarity between anchor nodes. Despite the recent advances, there still exist two kinds of challenges: (1) entangled node embeddings, arising from the contradictory goals of NA: embedding proximal nodes in a closed form for representation in a single network vs. discriminating among them when mapping the nodes across networks; and (2) lack of interpretability about the node matching and alignment, essential for understanding prediction tasks. We propose dNAME (disentangled Network Alignment with Matching Explainability) – a novel solution for NA in heterogeneous networks settings, based on a matching technique that embeds nodes in a disentangled and faithful manner. The NA task is cast as an adversarial optimization problem which learns a proximity-preserving model locally around the anchor nodes, while still being discriminative. We also introduce a method to explain our semi-supervised model with the theory of robust statistics, bymore »tracing the importance of each anchor node and its explanations on the NA performance. This is extensible to many other NA methods, as it provides model interpretability. Experiments conducted on several public datasets show that dNAME outperforms the state-of-the-art methods in terms of both network alignment precision and node matching ranking.« less
  3. We present FastRP, a scalable and performant algorithm for learning distributed node representations in a graph. FastRP is over 4,000 times faster than state-of-the-art methods such as DeepWalk and node2vec, while achieving comparable or even better performance as evaluated on several real-world networks on various downstream tasks. We observe that most network embedding methods consist of two components: construct a node similarity matrix and then apply dimension reduction techniques to this matrix. We show that the success of these methods should be attributed to the proper construction of this similarity matrix, rather than the dimension reduction method employed. FastRP is proposed as a scalable algorithm for network embeddings. Two key features of FastRP are: 1) it explicitly constructs a node similarity matrix that captures transitive relationships in a graph and normalizes matrix entries based on node degrees; 2) it utilizes very sparse random projection, which is a scalable optimization-free method for dimension reduction. An extra benefit from combining these two design choices is that it allows the iterative computation of node embeddings so that the similarity matrix need not be explicitly constructed, which further speeds up FastRP. FastRP is also advantageous for its ease of implementation, parallelization and hyperparameter tuning.more »The source code is available at« less
  4. Network representation learning (NRL) is crucial in the area of graph learning. Recently, graph autoencoders and its variants have gained much attention and popularity among various types of node embedding approaches. Most existing graph autoencoder-based methods aim to minimize the reconstruction errors of the input network while not explicitly considering the semantic relatedness between nodes. In this paper, we propose a novel network embedding method which models the consistency across different views of networks. More specifically, we create a second view from the input network which captures the relation between nodes based on node content and enforce the latent representations from the two views to be consistent by incorporating a multiview adversarial regularization module. The experimental studies on benchmark datasets prove the effectiveness of this method, and demonstrate that our method compares favorably with the state-of-the-art algorithms on challenging tasks such as link prediction and node clustering. We also evaluate our method on a real-world application, i.e., 30-day unplanned ICU readmission prediction, and achieve promising results compared with several baseline methods.
  5. We introduce a model for ant trail formation, building upon previous work on biologically feasible local algorithms that plausibly describe how ants maintain trail networks. The model is a variant of a reinforced random walk on a directed graph, where ants lay pheromone on edges as they traverse them and the next edge to traverse is chosen based on the level of pheromone; this pheromone decays with time. There is a bidirectional flow of ants in the network: the forward flow proceeds along forward edges from source (e.g. the nest) to sink (e.g. a food source), and the backward flow in the opposite direction. Some fraction of ants are lost as they pass through each node (modeling the loss of ants due to exploration observed in the field). We initiate a theoretical study of this model. We note that ant navigation has inspired the field of ant colony optimization, heuristics that have been applied to several combinatorial optimization problems; however the algorithms developed there are considerably more complex and not constrained to being biologically feasible. We first consider the linear decision rule, where the flow divides itself among the next set of edges in proportion to their pheromone level. Here,more »we show that the process converges to the path with minimum leakage when the forward and backward flows do not change over time. On the other hand, when the forward and backward flows increase over time (caused by positive reinforcement from the discovery of a food source, for example), we show that the process converges to the shortest path. These results are for graphs consisting of two parallel paths (a case that has been investigated before in experiments). Through simulations, we show that these results hold for more general graphs drawn from various random graph models; proving this convergence in the general case is an interesting open problem. Further, to understand the behaviour of other decision rules beyond the linear rule, we consider a general family of decision rules. For this family, we show that there is no advantage of using a non-linear decision rule, if the goal is to find the shortest or the minimum leakage path. We also show that bidirectional flow is necessary for convergence to such paths. Our results provide a plausible explanation for field observations, and open up new avenues for further theoretical and experimental investigation.« less