skip to main content

Title: Discerning Edge Influence for Network Embedding
Network embedding, which learns the low-dimensional representations of nodes, has gained significant research attention. Despite its superior empirical success, often measured by the prediction performance of downstream tasks (e.g., multi-label classification), it is unclear why a given embedding algorithm outputs the specific node representations, and how the resulting node representations relate to the structure of the input network. In this paper, we propose to discern the edge influence as the first step towards understanding skip-gram basd network embedding methods. For this purpose, we propose an auditing framework NEAR, whose key part includes two algorithms (NEAR-ADD and NEAR-DEL) to effectively and efficiently quantify the influence of each edge. Based on the algorithms, we further identify high-influential edges by exploiting the linkage between edge influence and the network structure. Experimental results demonstrate that the proposed algorithms (NEAR-ADD and NEAR-DEL) are significantly faster (up to 2, 000×) than straightforward methods with little quality loss. Moreover, the proposed framework can efficiently identify the most influential edges for network embedding in the context of downstream prediction task and adversarial attacking.
Authors:
; ; ; ;
Award ID(s):
1947135 1651203 1715385 2003924
Publication Date:
NSF-PAR ID:
10159174
Journal Name:
CIKM
Page Range or eLocation-ID:
429 to 438
Sponsoring Org:
National Science Foundation
More Like this
  1. Network embedding aims to automatically learn the node representations in networks. The basic idea of network embedding is to first construct a network to describe the neighborhood context for each node, and then learn the node representations by designing an objective function to preserve certain properties of the constructed context network. The vast majority of the existing methods, explicitly or implicitly, follow a pointwise design principle. That is, the objective can be decomposed into the summation of the certain goodness function over each individual edge of the context network. In this paper, we propose to go beyond such pointwise approaches, and introduce the ranking-oriented design principle for network embedding. The key idea is to decompose the overall objective function into the summation of a goodness function over a set of edges to collectively preserve their relative rankings on the context network. We instantiate the ranking-oriented design principle by two new network embedding algorithms, including a pairwise network embedding method PaWine which optimizes the relative weights of edge pairs, and a listwise method LiWine which optimizes the relative weights of edge lists. Both proposed algorithms bear a linear time complexity, making themselves scalable to large networks. We conduct extensive experimental evaluationsmore »on five real datasets with a variety of downstream learning tasks, which demonstrate that the proposed approaches consistently outperform the existing methods.« less
  2. Network embedding has become the cornerstone of a variety of mining tasks, such as classification, link prediction, clustering, anomaly detection and many more, thanks to its superior ability to encode the intrinsic network characteristics in a compact low-dimensional space. Most of the existing methods focus on a single network and/or a single resolution, which generate embeddings of different network objects (node/subgraph/network) from different networks separately. A fundamental limitation with such methods is that the intrinsic relationship across different networks (e.g., two networks share same or similar subgraphs) and that across different resolutions (e.g., the node-subgraph membership) are ignored, resulting in disparate embeddings. Consequentially, it leads to sub-optimal performance or even becomes inapplicable for some downstream mining tasks (e.g., role classification, network alignment. etc.). In this paper, we propose a unified framework MrMine to learn the representations of objects from multiple networks at three complementary resolutions (i.e., network, subgraph and node) simultaneously. The key idea is to construct the cross-resolution cross-network context for each object. The proposed method bears two distinctive features. First, it enables and/or boosts various multi-network downstream mining tasks by having embeddings at different resolutions from different networks in the same embedding space. Second, Our method is efficientmore »and scalable, with a O(nlog(n)) time complexity for the base algorithm and a linear time complexity w.r.t. the number of nodes and edges of input networks for the accelerated version. Extensive experiments on real-world data show that our methods (1) are able to enable and enhance a variety of multi-network mining tasks, and (2) scale up to million-node networks.« less
  3. Network embedding has demonstrated effective empirical performance for various network mining tasks such as node classification, link prediction, clustering, and anomaly detection. However, most of these algorithms focus on the single-view network scenario. From a real-world perspective, one individual node can have different connectivity patterns in different networks. For example, one user can have different relationships on Twitter, Facebook, and LinkedIn due to varying user behaviors on different platforms. In this case, jointly considering the structural information from multiple platforms (i.e., multiple views) can potentially lead to more comprehensive node representations, and eliminate noises and bias from a single view. In this paper, we propose a view-adversarial framework to generate comprehensive and robust multi-view network representations named VANE, which is based on two adversarial games. The first adversarial game enhances the comprehensiveness of the node representation by discriminating the view information which is obtained from the subgraph induced by neighbors of that node. The second adversarial game improves the robustness of the node representation with the challenging of fake node representations from the generative adversarial net. We conduct extensive experiments on downstream tasks with real-world multi-view networks, which shows that our proposed VANE framework significantly outperforms other baseline methods.
  4. Attributed network embedding aims to learn lowdimensional vector representations for nodes in a network, where each node contains rich attributes/features describing node content. Because network topology structure and node attributes often exhibit high correlation, incorporating node attribute proximity into network embedding is beneficial for learning good vector representations. In reality, large-scale networks often have incomplete/missing node content or linkages, yet existing attributed network embedding algorithms all operate under the assumption that networks are complete. Thus, their performance is vulnerable to missing data and suffers from poor scalability. In this paper, we propose a Scalable Incomplete Network Embedding (SINE) algorithm for learning node representations from incomplete graphs. SINE formulates a probabilistic learning framework that separately models pairs of node-context and node-attribute relationships. Different from existing attributed network embedding algorithms, SINE provides greater flexibility to make the best of useful information and mitigate negative effects of missing information on representation learning. A stochastic gradient descent based online algorithm is derived to learn node representations, allowing SINE to scale up to large-scale networks with high learning efficiency. We evaluate the effectiveness and efficiency of SINE through extensive experiments on real-world networks. Experimental results confirm that SINE outperforms state-of-the-art baselines in various tasks, includingmore »node classification, node clustering, and link prediction, under settings with missing links and node attributes. SINE is also shown to be scalable and efficient on large-scale networks with millions of nodes/edges and high-dimensional node features.« less
  5. Learning the low-dimensional representations of graphs (i.e., network embedding) plays a critical role in network analysis and facilitates many downstream tasks. Recently graph convolutional networks (GCNs) have revolutionized the field of network embedding, and led to state-of-the-art performance in network analysis tasks such as link prediction and node classification. Nevertheless, most of the existing GCN-based network embedding methods are proposed for unsigned networks. However, in the real world, some of the networks are signed, where the links are annotated with different polarities, e.g., positive vs. negative. Since negative links may have different properties from the positive ones and can also significantly affect the quality of network embedding. Thus in this paper, we propose a novel network embedding framework SNEA to learn Signed Network Embedding via graph Attention. In particular, we propose a masked self-attentional layer, which leverages self-attention mechanism to estimate the importance coefficient for pair of nodes connected by different type of links during the embedding aggregation process. Then SNEA utilizes the masked self-attentional layers to aggregate more important information from neighboring nodes to generate the node embeddings based on balance theory. Experimental results demonstrate the effectiveness of the proposed framework through signed link prediction task on several real-worldmore »signed network datasets.« less