skip to main content

Title: WMGCN: Weighted Meta-Graph Based Graph Convolutional Networks for Representation Learning in Heterogeneous Networks
Network embedding has been an effective tool to analyze heterogeneous networks (HNs) by representing nodes in a low-dimensional space. Although many recent methods have been proposed for representation learning of HNs, there is still much room for improvement. Random walks based methods are currently popular methods to learn network embedding; however, they are random and limited by the length of sampled walks, and have difculty capturing network structural information. Some recent researches proposed using meta paths to express the sample relationship in HNs. Another popular graph learning model, the graph convolutional network (GCN) is known to be capable of better exploitation of network topology, but the current design of GCN is intended for homogenous networks. This paper proposes a novel combination of meta-graph and graph convolution, the meta-graph based graph convolutional networks (MGCN). To fully capture the complex long semantic information, MGCN utilizes different meta-graphs in HNs. As different meta-graphs express different semantic relationships, MGCN learns the weights of different meta-graphs to make up for the loss of semantics when applying GCN. In addition, we improve the current convolution design by adding node self-signicance. To validate our model in learning feature representation, we present comprehensive experiments on four real-world datasets and two representation tasks: classication and link prediction. WMGCN's representations can improve accuracy scores by up to around 10% in comparison to other popular representation learning models. What's more, WMGCN'feature learning outperforms other popular baselines. The experimental results clearly show our model is superior over other state-of-the-art representation learning algorithms.  more » « less
Award ID(s):
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE access
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Demeniconi, C. ; Davidson, I (Ed.)
    Many irregular domains such as social networks, financial transactions, neuron connections, and natural language constructs are represented using graph structures. In recent years, a variety of graph neural networks (GNNs) have been successfully applied for representation learning and prediction on such graphs. In many of the real-world applications, the underlying graph changes over time, however, most of the existing GNNs are inadequate for handling such dynamic graphs. In this paper we propose a novel technique for learning embeddings of dynamic graphs using a tensor algebra framework. Our method extends the popular graph convolutional network (GCN) for learning representations of dynamic graphs using the recently proposed tensor M-product technique. Theoretical results presented establish a connection between the proposed tensor approach and spectral convolution of tensors. The proposed method TM-GCN is consistent with the Message Passing Neural Network (MPNN) framework, accounting for both spatial and temporal message passing. Numerical experiments on real-world datasets demonstrate the performance of the proposed method for edge classification and link prediction tasks on dynamic graphs. We also consider an application related to the COVID-19 pandemic, and show how our method can be used for early detection of infected individuals from contact tracing data. 
    more » « less
  2. Noise and inconsistency commonly exist in real-world information networks, due to the inherent error-prone nature of human or user privacy concerns. To date, tremendous efforts have been made to advance feature learning from networks, including the most recent graph convolutional networks (GCNs) or attention GCN, by integrating node content and topology structures. However, all existing methods consider networks as error-free sources and treat feature content in each node as independent and equally important to model node relations. Noisy node content, combined with sparse features, provides essential challenges for existing methods to be used in real-world noisy networks. In this article, we propose feature-based attention GCN (FA-GCN), a feature-attention graph convolution learning framework, to handle networks with noisy and sparse node content. To tackle noise and sparse content in each node, FA-GCN first employs a long short-term memory (LSTM) network to learn dense representation for each node feature. To model interactions between neighboring nodes, a feature-attention mechanism is introduced to allow neighboring nodes to learn and vary feature importance, with respect to their connections. By using a spectral-based graph convolution aggregation process, each node is allowed to concentrate more on the most determining neighborhood features aligned with the corresponding learning task. Experiments and validations, w.r.t. different noise levels, demonstrate that FA-GCN achieves better performance than the state-of-the-art methods in both noise-free and noisy network environments. 
    more » « less
  3. To deepen our understanding of graph neural networks, we investigate the representation power of Graph Convolutional Networks (GCN) through the looking glass of graph moments, a key property of graph topology encoding path of various lengths. We find that GCNs are rather restrictive in learning graph moments. Without careful design, GCNs can fail miserably even with multiple layers and nonlinear activation functions. We analyze theoretically the expressiveness of GCNs, arriving at a modular GCN design, using different propagation rules. Our modular design is capable of distinguishing graphs from different graph generation models for surprisingly small graphs, a notoriously difficult problem in network science. Our investigation suggests that, depth is much more influential than width and deeper GCNs are more capable of learning higher order graph moments. Additionally, combining GCN modules with different propagation rules is critical to the representation power of GCNs. 
    more » « less
  4. Graph neural networks (GNNs) have emerged as a powerful tool for modeling graph data due to their ability to learn a concise representation of the data by integrating the node attributes and link information in a principled fashion. However, despite their promise, there are several practical challenges that must be overcome to effectively use them for node classification problems. In particular, current approaches are vulnerable to different kinds of biases inherent in the graph data. First, if the class distribution is imbalanced, then the GNNs' loss function is biased towards classifying the majority class correctly rather than the minority class, which hurts the performance of the latter class. Second, due to homophily effect, the learned representation and subsequent downstream tasks may favor certain demographic groups over others when applied to social network data. To mitigate such biases, we propose a novel framework called Fairness-Aware Cost Sensitive Graph Convolutional Network (FACS-GCN) for classifying nodes in networks with skewed class distributions. Our approach combines a cost-sensitive exponential loss with an adversarial learning component to alleviate the ill-effects of both biases. The framework employs a stagewise additive modeling approach to ensure there is no significant loss in accuracy when imparting fairness into the GNN. Experimental results on 6 benchmark graph data demonstrate the effectiveness of FACS-GCN against comparable baseline methods in terms of promoting fairness while maintaining a high model accuracy on the majority of the datasets. 
    more » « less
  5. Abstract Background

    The recent development of high-throughput sequencing has created a large collection of multi-omics data, which enables researchers to better investigate cancer molecular profiles and cancer taxonomy based on molecular subtypes. Integrating multi-omics data has been proven to be effective for building more precise classification models. Most current multi-omics integrative models use either an early fusion in the form of concatenation or late fusion with a separate feature extractor for each omic, which are mainly based on deep neural networks. Due to the nature of biological systems, graphs are a better structural representation of bio-medical data. Although few graph neural network (GNN) based multi-omics integrative methods have been proposed, they suffer from three common disadvantages. One is most of them use only one type of connection, either inter-omics or intra-omic connection; second, they only consider one kind of GNN layer, either graph convolution network (GCN) or graph attention network (GAT); and third, most of these methods have not been tested on a more complex classification task, such as cancer molecular subtypes.


    In this study, we propose a novel end-to-end multi-omics GNN framework for accurate and robust cancer subtype classification. The proposed model utilizes multi-omics data in the form of heterogeneous multi-layer graphs, which combine both inter-omics and intra-omic connections from established biological knowledge. The proposed model incorporates learned graph features and global genome features for accurate classification. We tested the proposed model on the Cancer Genome Atlas (TCGA) Pan-cancer dataset and TCGA breast invasive carcinoma (BRCA) dataset for molecular subtype and cancer subtype classification, respectively. The proposed model shows superior performance compared to four current state-of-the-art baseline models in terms of accuracy, F1 score, precision, and recall. The comparative analysis of GAT-based models and GCN-based models reveals that GAT-based models are preferred for smaller graphs with less information and GCN-based models are preferred for larger graphs with extra information.

    more » « less