skip to main content


Title: Learning Mixed Membership from Adjacency Graph Via Systematic Edge Query: Identifiability and Algorithm
Graph clustering is a core technique for network analysis problems, e.g., community detection. This work puts forth a node clustering approach for largely incomplete adjacency graphs. Under the considered scenario, instead of having access to the complete graph, only a small amount of queries about the graph edges can be made for node clustering. This task is well-motivated in many large-scale network analysis problems, where complete graph acquisition is prohibitively costly. Prior work tackles this problem under the setting that the nodes only admit single membership and the clusters are disjoint, yet multiple membership nodes and overlapping clusters often arise in practice. Existing approaches also rely on random edge query patterns and convex optimization-based formulations, which give rise to a number of implementation and scalability challenges. This work offers a framework that provably learns the mixed membership of nodes from overlapping clusters using limited edge information. Our method is equipped with a systematic edge query pattern, which is arguably easier to implement relative to the random counterparts in certain applications, e.g., field survey based graph analysis. A lightweight scalable algorithm is proposed, and its performance characterizations are presented. Numerical experiments are used to showcase the effectiveness of our method  more » « less
Award ID(s):
2007836
PAR ID:
10292992
Author(s) / Creator(s):
;
Date Published:
Journal Name:
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Page Range / eLocation ID:
5370 to 5374
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    This work proposes a new unsupervised (or self-supervised) node representation learning method that aims to leverage the coarse-grain information that is available in most graphs. This extends previous attempts that only leverage fine-grain information (similarities within local neighborhoods) or global graph information (similarities across all nodes). Intuitively, the proposed method identifies nodes that belong to the same clusters and maximizes their mutual information. Thus, coarse-grain (cluster-level) similarities that are shared between nodes are preserved in their representations. The core components of the proposed method are (i) a jointly optimized clustering of nodes during learning and (ii) an Infomax objective term that preserves the mutual information among nodes of the same clusters. Our method is able to outperform competing state-of-art methods in various downstream tasks, such as node classification, link prediction, and node clustering. Experiments show that the average gain is between 0.2% and 6.1%, over the best competing approach, over all tasks. Our code is publicly available at: https://github.com/cmavro/Graph-InfoClust-GIC. 
    more » « less
  2. A source node updates its status as a point process and also forwards its updates to a network of observer nodes. Within the network of observers, these updates are forwarded as point processes from node to node. Each node wishes its knowledge of the source to be as timely as possible. In this network, timeliness is measured by a discrete form of age of information: each status change at the source is referred to as a version and the age at a node is how many versions out of date is its most recent update from the source. This work introduces a method for evaluating the average version age at each node in the network when nodes forward updates using a memoryless gossip protocol. This method is then demonstrated by version age analysis for a collection of simple networks. For gossip on a complete graph with symmetric updating rates, it is shown that each node has average age that grows as the logarithm of the network size. 
    more » « less
  3. Path-based solutions have been shown to be useful for various graph analysis tasks, such as link prediction and graph clustering. However, they are no longer adequate for handling complex and gigantic graphs. Recently, motif-based analysis has attracted a lot of attention. A motif, or a small graph with a few nodes, is often considered as a fundamental unit of a graph. Motif-based analysis captures high-order structure between nodes, and performs better than traditional "edge-based" solutions. In this paper, we study motif-path , which is conceptually a concatenation of one or more motif instances. We examine how motif-paths can be used in three path-based mining tasks, namely link prediction, local graph clustering and node ranking. We further address the situation when two graph nodes are not connected through a motif-path, and develop a novel defragmentation method to enhance it. Experimental results on real graph datasets demonstrate the use of motif-paths and defragmentation techniques improves graph analysis effectiveness. 
    more » « less
  4. null (Ed.)
    This paper will demonstrate a novel method for consolidating data in an engineered hypercube network for the purpose of optimizing query processing. Query processing typically calls for merging data collected from a small subset of server nodes in a network. This poses the problem of managing efficiently the exchange of data between processing nodes to complete some relational data operation. The method developed here is designed to minimize data transfer, measured as the product of data quantity and network distance, by delegating the processing to a node that is relatively central to the subset. A hypercube not only supports simple computation of network distance between nodes, but also allows for identifying a node to serve as the center for any data consolidation operations.We will show how the consolidation process can be performed by selecting a subgraph of a complex network to simplify the selection of a central node and thus facilitate the computations required. We will also show a prototype implementation of a hypercube using Software-Defined Networking to support query optimization in a distributed heterogeneous database system, making use of network distance information and data quantity. 
    more » « less
  5. Graphs/Networks are common in real-world applications where data have rich content and complex relationships. The increasing popularity also motivates many network learning algorithms, such as community detection, clustering, classification, and embedding learning, etc.. In reality, the large network volumes often hider a direct use of learning algorithms to the graphs. As a result, it is desirable to have the flexibility to condense a network to an arbitrary size, with well-preserved network topology and node content information. In this paper, we propose a graph compression network (GEN) to achieve network compression and embedding at the same time. Our theme is to leverage the network topology to find node mappings, such that densely connected nodes, including their node content, are compressed as a new node, with a latent vector (i.e. embedding) being learned to represent the compressed node. In addition to compression learning, we also develop a novel encoding-decoding framework, using feature diffusion process, to "decompress" the condensed network. Different from traditional graph convolution which uses direct-neighbor message passing, our decompression advocates high-order message passing within compressed nodes to learning feature representation for all nodes in the network. A unique strength of GEN is that it leverages the graph neural network principle to learn mapping automatically, so one can compress a network to an arbitrary size, and also decompress it to the original node space with minimum information loss. Experiments and comparisons confirm that GEN can automatically find clusters and communities, and compress them as new nodes. Results also show that GEN achieves improved performance for numerous tasks, including graph classification and node clustering. 
    more » « less