skip to main content

Title: ADMIRING: Adversarial Multi-network Mining
Multi-sourced networks naturally appear in many application domains, ranging from bioinformatics, social networks, neuroscience to management. Although state-of-the-art offers rich models and algorithms to find various patterns when input networks are given, it has largely remained nascent on how vulnerable the mining results are due to the adversarial attacks. In this paper, we address the problem of attacking multi-network mining through the way of deliberately perturbing the networks to alter the mining results. The key idea of the proposed method (ADMIRING) is effective influence functions on the Sylvester equation defined over the input networks, which plays a central and unifying role in various multi-network mining tasks. The proposed algorithms bear two main advantages, including (1) effectiveness, being able to accurately quantify the rate of change of the mining results in response to attacks; and (2) generality, being applicable to a variety of multi-network mining tasks ( e.g., graph kernel, network alignment, cross-network node similarity) with different attacking strategies (e.g., edge/node removal, attribute alteration).
Authors:
; ; ; ;
Award ID(s):
1947135 1651203 1715385 2003924
Publication Date:
NSF-PAR ID:
10159293
Journal Name:
ICDM
Page Range or eLocation-ID:
1522 to 1527
Sponsoring Org:
National Science Foundation
More Like this
  1. Network embedding has become the cornerstone of a variety of mining tasks, such as classification, link prediction, clustering, anomaly detection and many more, thanks to its superior ability to encode the intrinsic network characteristics in a compact low-dimensional space. Most of the existing methods focus on a single network and/or a single resolution, which generate embeddings of different network objects (node/subgraph/network) from different networks separately. A fundamental limitation with such methods is that the intrinsic relationship across different networks (e.g., two networks share same or similar subgraphs) and that across different resolutions (e.g., the node-subgraph membership) are ignored, resulting in disparate embeddings. Consequentially, it leads to sub-optimal performance or even becomes inapplicable for some downstream mining tasks (e.g., role classification, network alignment. etc.). In this paper, we propose a unified framework MrMine to learn the representations of objects from multiple networks at three complementary resolutions (i.e., network, subgraph and node) simultaneously. The key idea is to construct the cross-resolution cross-network context for each object. The proposed method bears two distinctive features. First, it enables and/or boosts various multi-network downstream mining tasks by having embeddings at different resolutions from different networks in the same embedding space. Second, Our method is efficientmore »and scalable, with a O(nlog(n)) time complexity for the base algorithm and a linear time complexity w.r.t. the number of nodes and edges of input networks for the accelerated version. Extensive experiments on real-world data show that our methods (1) are able to enable and enhance a variety of multi-network mining tasks, and (2) scale up to million-node networks.« less
  2. Network embedding, which learns the low-dimensional representations of nodes, has gained significant research attention. Despite its superior empirical success, often measured by the prediction performance of downstream tasks (e.g., multi-label classification), it is unclear why a given embedding algorithm outputs the specific node representations, and how the resulting node representations relate to the structure of the input network. In this paper, we propose to discern the edge influence as the first step towards understanding skip-gram basd network embedding methods. For this purpose, we propose an auditing framework NEAR, whose key part includes two algorithms (NEAR-ADD and NEAR-DEL) to effectively and efficiently quantify the influence of each edge. Based on the algorithms, we further identify high-influential edges by exploiting the linkage between edge influence and the network structure. Experimental results demonstrate that the proposed algorithms (NEAR-ADD and NEAR-DEL) are significantly faster (up to 2, 000×) than straightforward methods with little quality loss. Moreover, the proposed framework can efficiently identify the most influential edges for network embedding in the context of downstream prediction task and adversarial attacking.
  3. Network embedding has demonstrated effective empirical performance for various network mining tasks such as node classification, link prediction, clustering, and anomaly detection. However, most of these algorithms focus on the single-view network scenario. From a real-world perspective, one individual node can have different connectivity patterns in different networks. For example, one user can have different relationships on Twitter, Facebook, and LinkedIn due to varying user behaviors on different platforms. In this case, jointly considering the structural information from multiple platforms (i.e., multiple views) can potentially lead to more comprehensive node representations, and eliminate noises and bias from a single view. In this paper, we propose a view-adversarial framework to generate comprehensive and robust multi-view network representations named VANE, which is based on two adversarial games. The first adversarial game enhances the comprehensiveness of the node representation by discriminating the view information which is obtained from the subgraph induced by neighbors of that node. The second adversarial game improves the robustness of the node representation with the challenging of fake node representations from the generative adversarial net. We conduct extensive experiments on downstream tasks with real-world multi-view networks, which shows that our proposed VANE framework significantly outperforms other baseline methods.
  4. The past decades have witnessed the prosperity of graph mining, with a multitude of sophisticated models and algorithms designed for various mining tasks, such as ranking, classification, clustering and anomaly detection. Generally speaking, the vast majority of the existing works aim to answer the following question, that is, given a graph, what is the best way to mine it? In this paper, we introduce the graph sanitation problem, to an- swer an orthogonal question. That is, given a mining task and an initial graph, what is the best way to improve the initially provided graph? By learning a better graph as part of the input of the mining model, it is expected to benefit graph mining in a variety of settings, ranging from denoising, imputation to defense. We formulate the graph sanitation problem as a bilevel optimization problem, and fur- ther instantiate it by semi-supervised node classification, together with an effective solver named GaSoliNe. Extensive experimental results demonstrate that the proposed method is (1) broadly appli- cable with respect to various graph neural network models and flexible graph modification strategies, (2) effective in improving the node classification accuracy on both the original and contaminated graphs in various perturbation scenarios. Inmore »particular, it brings up to 25% performance improvement over the existing robust graph neural network methods.« less
  5. Network mining plays a pivotal role in many high-impact application domains, including information retrieval, healthcare, social network analysis, security and recommender systems. State-of-the-art offers a wealth of sophisticated network mining algorithms, many of which have been widely adopted in real-world with superior empirical performance. Nonetheless, they often lack effective and efficient ways to characterize how the results of a given mining task relate to the underlying network structure. In this paper, we introduce network derivative mining problem. Given the input network and a specific mining algorithm, network derivative mining finds a derivative network whose edges measure the influence of the corresponding edges of the input network on the mining results. We envision that network derivative mining could be beneficial in a variety of scenarios, ranging from explainable network mining, adversarial network mining, sensitivity analysis on network structure, active learning, learning with side information to counterfactual learning on networks. We propose a generic framework for network derivative mining from the optimization perspective and provide various instantiations for three classic network mining tasks, including ranking, clustering, and matrix completion. For each mining task, we develop effective algorithm for constructing the derivative network based on influence function analysis, with numerous optimizations to ensuremore »a linear complexity in both time and space. Extensive experimental evaluation on real-world datasets demonstrates the efficacy of the proposed framework and algorithms.« less