Graph clustering is a fundamental problem in social network analysis, the goal of which is to group vertices of a graph into a series of densely knitted clusters with each cluster well separated from all the others. Classical graph clustering methods take advantage of the graph topology to model and quantify vertex proximity. With the proliferation of rich graph contents, such as user profiles in social networks, and gene annotations in protein interaction networks, it is essential to consider both the structure and content information of graphs for high-quality graph clustering. In this paper, we propose a graph embedding approach to clustering content-enriched graphs. The key idea is to embed each vertex of a graph into a continuous vector space where the localized structural and attributive information of vertices can be encoded in a unified, latent representation. Specifically, we quantify vertex-wise attribute proximity into edge weights, and employ truncated, attribute-aware random walks to learn the latent representations for vertices. We evaluate our attribute-aware graph embedding method in real-world attributed graphs, and the results demonstrate its effectiveness in comparison with state-of-the-art algorithms.
more »
« less
Representing Higher-Order Networks with Spectral Moments
The spectral properties of traditional (dyadic) graphs, where an edge connects exactly two vertices, are widely studied in different applications. These spectral properties are closely connected to the structural properties of dyadic graphs. We generalize such connections and characterize higher-order networks by their spectral information. We first split the higher-order graphs by their “edge orders” into several uniform hypergraphs. For each uniform hypergraph, we extract the corresponding spectral information from the transition matrices of carefully designed random walks. From each spectrum, we compute the first few spectral moments and use all such spectral moments across different “edge orders” as the higher-order graph representation. We show that these moments not only clearly indicate the return probabilities of random walks but are also closely related to various higher-order network properties such as degree distribution and clustering coefficient. Extensive experiments show the utility of this new representation in various settings. For instance, graph classification on higher-order graphs shows that this representation significantly outperforms other techniques.
more »
« less
- Award ID(s):
- 1942929
- PAR ID:
- 10639025
- Publisher / Repository:
- Springer Nature Singapore
- Date Published:
- Page Range / eLocation ID:
- 211 to 223
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
To deepen our understanding of graph neural networks, we investigate the representation power of Graph Convolutional Networks (GCN) through the looking glass of graph moments, a key property of graph topology encoding path of various lengths. We find that GCNs are rather restrictive in learning graph moments. Without careful design, GCNs can fail miserably even with multiple layers and nonlinear activation functions. We analyze theoretically the expressiveness of GCNs, arriving at a modular GCN design, using different propagation rules. Our modular design is capable of distinguishing graphs from different graph generation models for surprisingly small graphs, a notoriously difficult problem in network science. Our investigation suggests that, depth is much more influential than width and deeper GCNs are more capable of learning higher order graph moments. Additionally, combining GCN modules with different propagation rules is critical to the representation power of GCNs.more » « less
-
Higher order random walks (HD-walks) on high dimensional expanders (HDX) have seen an incredible amount of study and application since their introduction by Kaufman and Mass (ITCS 2016), yet their broader combinatorial and spectral properties remain poorly understood. We develop a combinatorial characterization of the spectral structure of HD-walks on two-sided local-spectral expanders (Dinur and Kaufman FOCS 2017), which offer a broad generalization of the well-studied Johnson and Grassmann graphs. Our characterization, which shows that the spectra of HD-walks lie tightly concentrated in a few combinatorially structured strips, leads to novel structural theorems such as a tight ℓ2-characterization of edge-expansion, as well as to a new understanding of local-to-global graph algorithms on HDX. Towards the latter, we introduce a novel spectral complexity measure called Stripped Threshold Rank, and show how it can replace the (much larger) threshold rank as a parameter controlling the performance of algorithms on structured objects. Combined with a sum-of-squares proof for the former ℓ2-characterization, we give a concrete application of this framework to algorithms for unique games on HD-walks, where in many cases we improve the state of the art (Barak, Raghavendra, and Steurer FOCS 2011, and Arora, Barak, and Steurer JACM 2015) from nearly-exponential to polynomial time (e.g. for sparsifications of Johnson graphs or of slices of the q-ary hypercube). Our characterization of expansion also holds an interesting connection to hardness of approximation, where an ℓ∞-variant for the Grassmann graphs was recently used to resolve the 2-2 Games Conjecture (Khot, Minzer, and Safra FOCS 2018). We give a reduction from a related ℓ∞-variant to our ℓ2-characterization, but it loses factors in the regime of interest for hardness where the gap between ℓ2 and ℓ∞ structure is large. Nevertheless, our results open the door for further work on the use of HDX in hardness of approximation and their general relation to unique games.more » « less
-
Network embedding has been an effective tool to analyze heterogeneous networks (HNs) by representing nodes in a low-dimensional space. Although many recent methods have been proposed for representation learning of HNs, there is still much room for improvement. Random walks based methods are currently popular methods to learn network embedding; however, they are random and limited by the length of sampled walks, and have difculty capturing network structural information. Some recent researches proposed using meta paths to express the sample relationship in HNs. Another popular graph learning model, the graph convolutional network (GCN) is known to be capable of better exploitation of network topology, but the current design of GCN is intended for homogenous networks. This paper proposes a novel combination of meta-graph and graph convolution, the meta-graph based graph convolutional networks (MGCN). To fully capture the complex long semantic information, MGCN utilizes different meta-graphs in HNs. As different meta-graphs express different semantic relationships, MGCN learns the weights of different meta-graphs to make up for the loss of semantics when applying GCN. In addition, we improve the current convolution design by adding node self-signicance. To validate our model in learning feature representation, we present comprehensive experiments on four real-world datasets and two representation tasks: classication and link prediction. WMGCN's representations can improve accuracy scores by up to around 10% in comparison to other popular representation learning models. What's more, WMGCN'feature learning outperforms other popular baselines. The experimental results clearly show our model is superior over other state-of-the-art representation learning algorithms.more » « less
-
null (Ed.)Although graph convolutional networks (GCNs) that extend the convolution operation from images to graphs have led to competitive performance, the existing GCNs are still difficult to handle a variety of applications, especially cheminformatics problems. Recently multiple GCNs are applied to chemical compound structures which are represented by the hydrogen-depleted molecular graphs of different size. GCNs built for a binary adjacency matrix that reflects the connectivity among nodes in a graph do not account for the edge consistency in multiple molecular graphs, that is, chemical bonds (edges) in different molecular graphs can be similar due to the similar enthalpy and interatomic distance. In this paper, we propose a variant of GCN where a molecular graph is first decomposed into multiple views of the graph, each comprising a specific type of edges. In each view, an edge consistency constraint is enforced so that similar edges in different graphs can receive similar attention weights when passing information. Similarly to prior work, we prove that in each layer, our method corresponds to a spectral filter derived by the first order Chebyshev approximation of graph Laplacian. Extensive experiments demonstrate the substantial advantages of the proposed technique in quantitative structure-activity relationship prediction.more » « less
An official website of the United States government

