Network embedding has become the cornerstone of a variety of mining tasks, such as classification, link prediction, clustering, anomaly detection and many more, thanks to its superior ability to encode the intrinsic network characteristics in a compact low-dimensional space. Most of the existing methods focus on a single network and/or a single resolution, which generate embeddings of different network objects (node/subgraph/network) from different networks separately. A fundamental limitation with such methods is that the intrinsic relationship across different networks (e.g., two networks share same or similar subgraphs) and that across different resolutions (e.g., the node-subgraph membership) are ignored, resulting in disparate embeddings. Consequentially, it leads to sub-optimal performance or even becomes inapplicable for some downstream mining tasks (e.g., role classification, network alignment. etc.). In this paper, we propose a unified framework MrMine to learn the representations of objects from multiple networks at three complementary resolutions (i.e., network, subgraph and node) simultaneously. The key idea is to construct the cross-resolution cross-network context for each object. The proposed method bears two distinctive features. First, it enables and/or boosts various multi-network downstream mining tasks by having embeddings at different resolutions from different networks in the same embedding space. Second, Our method is efficient and scalable, with a O(nlog(n)) time complexity for the base algorithm and a linear time complexity w.r.t. the number of nodes and edges of input networks for the accelerated version. Extensive experiments on real-world data show that our methods (1) are able to enable and enhance a variety of multi-network mining tasks, and (2) scale up to million-node networks.
more »
« less
ADMIRING: Adversarial Multi-network Mining
Multi-sourced networks naturally appear in many application domains, ranging from bioinformatics, social networks, neuroscience to management. Although state-of-the-art offers rich models and algorithms to find various patterns when input networks are given, it has largely remained nascent on how vulnerable the mining results are due to the adversarial attacks. In this paper, we address the problem of attacking multi-network mining through the way of deliberately perturbing the networks to alter the mining results. The key idea of the proposed method (ADMIRING) is effective influence functions on the Sylvester equation defined over the input networks, which plays a central and unifying role in various multi-network mining tasks. The proposed algorithms bear two main advantages, including (1) effectiveness, being able to accurately quantify the rate of change of the mining results in response to attacks; and (2) generality, being applicable to a variety of multi-network mining tasks ( e.g., graph kernel, network alignment, cross-network node similarity) with different attacking strategies (e.g., edge/node removal, attribute alteration).
more »
« less
- PAR ID:
- 10159293
- Date Published:
- Journal Name:
- ICDM
- Page Range / eLocation ID:
- 1522 to 1527
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Network embedding, which learns the low-dimensional representations of nodes, has gained significant research attention. Despite its superior empirical success, often measured by the prediction performance of downstream tasks (e.g., multi-label classification), it is unclear why a given embedding algorithm outputs the specific node representations, and how the resulting node representations relate to the structure of the input network. In this paper, we propose to discern the edge influence as the first step towards understanding skip-gram basd network embedding methods. For this purpose, we propose an auditing framework NEAR, whose key part includes two algorithms (NEAR-ADD and NEAR-DEL) to effectively and efficiently quantify the influence of each edge. Based on the algorithms, we further identify high-influential edges by exploiting the linkage between edge influence and the network structure. Experimental results demonstrate that the proposed algorithms (NEAR-ADD and NEAR-DEL) are significantly faster (up to 2, 000×) than straightforward methods with little quality loss. Moreover, the proposed framework can efficiently identify the most influential edges for network embedding in the context of downstream prediction task and adversarial attacking.more » « less
-
null (Ed.)Network embedding has demonstrated effective empirical performance for various network mining tasks such as node classification, link prediction, clustering, and anomaly detection. However, most of these algorithms focus on the single-view network scenario. From a real-world perspective, one individual node can have different connectivity patterns in different networks. For example, one user can have different relationships on Twitter, Facebook, and LinkedIn due to varying user behaviors on different platforms. In this case, jointly considering the structural information from multiple platforms (i.e., multiple views) can potentially lead to more comprehensive node representations, and eliminate noises and bias from a single view. In this paper, we propose a view-adversarial framework to generate comprehensive and robust multi-view network representations named VANE, which is based on two adversarial games. The first adversarial game enhances the comprehensiveness of the node representation by discriminating the view information which is obtained from the subgraph induced by neighbors of that node. The second adversarial game improves the robustness of the node representation with the challenging of fake node representations from the generative adversarial net. We conduct extensive experiments on downstream tasks with real-world multi-view networks, which shows that our proposed VANE framework significantly outperforms other baseline methods.more » « less
-
Network mining plays a pivotal role in many high-impact application domains, including information retrieval, healthcare, social network analysis, security and recommender systems. State-of-the-art offers a wealth of sophisticated network mining algorithms, many of which have been widely adopted in real-world with superior empirical performance. Nonetheless, they often lack effective and efficient ways to characterize how the results of a given mining task relate to the underlying network structure. In this paper, we introduce network derivative mining problem. Given the input network and a specific mining algorithm, network derivative mining finds a derivative network whose edges measure the influence of the corresponding edges of the input network on the mining results. We envision that network derivative mining could be beneficial in a variety of scenarios, ranging from explainable network mining, adversarial network mining, sensitivity analysis on network structure, active learning, learning with side information to counterfactual learning on networks. We propose a generic framework for network derivative mining from the optimization perspective and provide various instantiations for three classic network mining tasks, including ranking, clustering, and matrix completion. For each mining task, we develop effective algorithm for constructing the derivative network based on influence function analysis, with numerous optimizations to ensure a linear complexity in both time and space. Extensive experimental evaluation on real-world datasets demonstrates the efficacy of the proposed framework and algorithms.more » « less
-
Network alignment, which aims to find the node correspondence across multiple networks, is a fundamental task in many areas, ranging from social network analysis to adversarial activity detection. The state-of-the-art in the data mining community often view the node correspondence as a probabilistic cross-network node similarity, and thus inevitably introduce an O(n2) lower bound on the computational complexity. Moreover, they might ignore the rich patterns (e.g., clusters) accompanying the real networks. In this paper, we propose a multilevel network alignment algorithm (Moana) which consists of three key steps. It first efficiently coarsens the input networks into their structured representations, and then aligns the coarsest representations of the input networks, followed by the interpolations to obtain the alignment at multiple levels including the node level at the finest granularity. The proposed coarsen-align-interpolate method bears two key advantages. First, it overcomes the O(n2) lower bound, achieving a linear complexity. Second, it helps reveal the alignment between rich patterns of the input networks at multiple levels (e.g., node, clusters, super-clusters, etc.). Extensive experimental evaluations demonstrate the efficacy of the proposed algorithm on both the node-level alignment and the alignment among rich patterns (e.g., clusters) at different granularities.more » « less