Data from many real-world applications can be naturally represented by multi-view networks where the different views encode different types of relationships (e.g., friendship, shared interests in music, etc.) between real-world individuals or entities. There is an urgent need for methods to obtain low-dimensional, information preserving and typically nonlinear embeddings of such multi-view networks. However, most of the work on multi-view learning focuses on data that lack a network structure, and most of the work on network embeddings has focused primarily on single-view networks. Against this background, we consider the multi-view network representation learning problem, i.e., the problem of constructing low-dimensional information preserving embeddings of multi-view networks. Specifically, we investigate a novel Generative Adversarial Network (GAN) framework for Multi-View Network Embedding, namely MEGAN, aimed at preserving the information from the individual network views, while accounting for connectivity across (and hence complementarity of and correlations between) different views. The results of our experiments on two real-world multi-view data sets show that the embeddings obtained using MEGAN outperform the state-of-the-art methods on node classification, link prediction and visualization tasks.
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
- Page Range or eLocation-ID:
- Sponsoring Org:
- National Science Foundation
More Like this
Network embedding has demonstrated effective empirical performance for various network mining tasks such as node classification, link prediction, clustering, and anomaly detection. However, most of these algorithms focus on the single-view network scenario. From a real-world perspective, one individual node can have different connectivity patterns in different networks. For example, one user can have different relationships on Twitter, Facebook, and LinkedIn due to varying user behaviors on different platforms. In this case, jointly considering the structural information from multiple platforms (i.e., multiple views) can potentially lead to more comprehensive node representations, and eliminate noises and bias from a single view. In this paper, we propose a view-adversarial framework to generate comprehensive and robust multi-view network representations named VANE, which is based on two adversarial games. The first adversarial game enhances the comprehensiveness of the node representation by discriminating the view information which is obtained from the subgraph induced by neighbors of that node. The second adversarial game improves the robustness of the node representation with the challenging of fake node representations from the generative adversarial net. We conduct extensive experiments on downstream tasks with real-world multi-view networks, which shows that our proposed VANE framework significantly outperforms other baseline methods.
As heterogeneous networks have become increasingly ubiquitous, Heterogeneous Information Network (HIN) embedding, aiming to project nodes into a low-dimensional space while preserving the heterogeneous structure, has drawn increasing attention in recent years. Many of the existing HIN embedding methods adopt meta-path guided random walk to retain both the semantics and structural correlations between different types of nodes. However, the selection of meta-paths is still an open problem, which either depends on domain knowledge or is learned from label information. As a uniform blueprint of HIN, the network schema comprehensively embraces the high-order structure and contains rich semantics. In this paper, we make the first attempt to study network schema preserving HIN embedding, and propose a novel model named NSHE. In NSHE, a network schema sampling method is first proposed to generate sub-graphs (i.e., schema instances), and then multi-task learning task is built to preserve the heterogeneous structure of each schema instance. Besides preserving pairwise structure information, NSHE is able to retain high-order structure (i.e., network schema). Extensive experiments on three real-world datasets demonstrate that our proposed model NSHE significantly outperforms the state-of-the-art methods.
Network embedding has become the cornerstone of a variety of mining tasks, such as classification, link prediction, clustering, anomaly detection and many more, thanks to its superior ability to encode the intrinsic network characteristics in a compact low-dimensional space. Most of the existing methods focus on a single network and/or a single resolution, which generate embeddings of different network objects (node/subgraph/network) from different networks separately. A fundamental limitation with such methods is that the intrinsic relationship across different networks (e.g., two networks share same or similar subgraphs) and that across different resolutions (e.g., the node-subgraph membership) are ignored, resulting in disparate embeddings. Consequentially, it leads to sub-optimal performance or even becomes inapplicable for some downstream mining tasks (e.g., role classification, network alignment. etc.). In this paper, we propose a unified framework MrMine to learn the representations of objects from multiple networks at three complementary resolutions (i.e., network, subgraph and node) simultaneously. The key idea is to construct the cross-resolution cross-network context for each object. The proposed method bears two distinctive features. First, it enables and/or boosts various multi-network downstream mining tasks by having embeddings at different resolutions from different networks in the same embedding space. Second, Our method is efficientmore »
Multi-label classification is a challenging structured prediction task in which a set of output class labels are predicted for each input. Real-world datasets often have natural or latent taxonomic relationships between labels, making it desirable for models to employ label representations capable of capturing such taxonomies. Most existing multi-label classification methods do not do so, resulting in label predictions that are inconsistent with the taxonomic constraints, thus failing to accurately represent the fundamentals of problem setting. In this work, we introduce the multi-label box model (MBM), a multi-label classification method that combines the encoding power of neural networks with the inductive bias and probabilistic semantics of box embeddings (Vilnis, et al 2018). Box embeddings can be understood as trainable Venn-diagrams based on hyper-rectangles. Representing labels by boxes rather than vectors, MBM is able to capture taxonomic relations among labels. Furthermore, since box embeddings allow these relations to be learned by stochastic gradient descent from data, and to be read as calibrated conditional probabilities, our model is endowed with a high degree of interpretability. This interpretability also facilitates the injection of partial information about label-label relationships into model training, to further improve its consistency. We provide theoretical grounding for our methodmore »