Title: Quasi-Stable Coloring for Graph Compression: Approximating Max-Flow, Linear Programs, and Centrality
We propose quasi-stable coloring, an approximate version of stable coloring. Stable coloring, also called color refinement, is a well-studied technique in graph theory for classifying vertices, which can be used to build compact, lossless representations of graphs. However, its usefulness is limited due to its reliance on strict symmetries. Real data compresses very poorly using color refinement. We propose the first, to our knowledge, approximate color refinement scheme, which we call quasi-stable coloring. By using approximation, we alleviate the need for strict symmetry, and allow for a tradeoff between the degree of compression and the accuracy of the representation. We study three applications: Linear Programming, Max-Flow, and Betweenness Centrality, and provide theoretical evidence in each case that a quasi-stable coloring can lead to good approximations on the reduced graph. Next, we consider how to compute a maximal quasi-stable coloring: we prove that, in general, this problem is NP-hard, and propose a simple, yet effective algorithm based on heuristics. Finally, we evaluate experimentally the quasi-stable coloring technique on several real graphs and applications, comparing with prior approximation techniques.
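As an illustration of the ideas in the abstract, the sketch below shows classical color refinement (stable coloring) on an undirected graph, plus a helper that measures how far an arbitrary coloring is from being stable. The abstract does not spell out the formal definition of a quasi-stable coloring, so the measure used here is an assumption: the largest difference, over all pairs of colors, in the number of neighbors of one color seen by two vertices of the other color. The function names `color_refinement` and `max_degree_spread` are hypothetical and do not come from the paper, and this is not the paper's heuristic algorithm.

```python
from collections import defaultdict


def color_refinement(adj):
    """Return a stable coloring of an undirected graph.

    adj maps each vertex to an iterable of its neighbors.
    The result maps each vertex to an integer color id.
    """
    color = {v: 0 for v in adj}
    while True:
        # A vertex's signature is its own color plus the sorted
        # multiset of its neighbors' colors.
        sig = {v: (color[v], tuple(sorted(color[u] for u in adj[v])))
               for v in adj}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new_color = {v: ids[sig[v]] for v in adj}
        # Each round can only split classes, so an unchanged class
        # count means the partition is stable.
        if len(set(new_color.values())) == len(set(color.values())):
            return new_color
        color = new_color


def max_degree_spread(adj, color):
    """Assumed quasi-stability measure (not taken from the abstract).

    For every ordered pair of colors (a, b), compare how many
    b-colored neighbors the a-colored vertices have, and return the
    largest max-minus-min gap. A stable coloring yields 0.
    """
    colors = set(color.values())
    counts = defaultdict(list)
    for v in adj:
        per_color = defaultdict(int)
        for u in adj[v]:
            per_color[color[u]] += 1
        for c in colors:
            counts[(color[v], c)].append(per_color[c])
    return max((max(x) - min(x) for x in counts.values()), default=0)


# Path graph 0 - 1 - 2 - 3: endpoints and inner vertices separate.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
stable = color_refinement(path)
trivial = {v: 0 for v in path}
print(stable, max_degree_spread(path, stable), max_degree_spread(path, trivial))
```

Under this assumed measure, accepting a larger spread permits a coarser coloring with fewer colors, which corresponds to the tradeoff between compression and accuracy that the abstract describes.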
Award ID(s):
2109922 1907997
NSF-PAR ID:
10428132
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the VLDB Endowment
Volume:
16
Issue:
4
ISSN:
2150-8097
Page Range / eLocation ID:
803 to 815
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In numerous graph signal processing applications, data is often missing for a variety of reasons, and predicting the missing data is essential. In this paper, we consider data on graphs modeled as bandlimited graph signals. Predicting or reconstructing the unknown signal values for such a model requires an estimate of the signal bandwidth. In this paper, we address the problem of estimating the reconstruction errors, minimizing which would thereby provide an estimate of the signal bandwidth. In doing so, we design a cross-validation approach needed for stable graph signal reconstruction and propose a method for estimating the reconstruction errors for different choices of signal bandwidth. Using this technique, we are able to estimate the reconstruction error on a variety of real-world graphs. 
  2. The claw is the graph $K_{1,3}$, and the fork is the graph obtained from the claw $K_{1,3}$ by subdividing one of its edges once. In this paper, we prove a structure theorem for the class of (claw, $C_4$)-free graphs that are not quasi-line graphs, and a structure theorem for the class of (fork, $C_4$)-free graphs that uses the class of (claw, $C_4$)-free graphs as a basic class. Finally, we show that every (fork, $C_4$)-free graph $G$ satisfies $\chi(G)\leqslant \lceil\frac{3\omega(G)}{2}\rceil$ via these structure theorems with some additional work on coloring basic classes.
  3. Markov Chain Monte Carlo (MCMC) has been the de facto technique for sampling and inference of large graphs such as online social networks. At the heart of MCMC lies the ability to construct an ergodic Markov chain that attains any given stationary distribution $\pi$, often in the form of random walks or crawling agents on the graph. Most of the works around MCMC, however, presume that the graph is undirected or has reciprocal edges, and become inapplicable when the graph is directed and non-reciprocal. Here we develop a similar framework for directed graphs called Non-Markovian Monte Carlo (NMMC) by establishing a mapping to convert $\pi$ into the quasi-stationary distribution of a carefully constructed transient Markov chain on an extended state space. As applications, we demonstrate how to achieve any given distribution $\pi$ on a directed graph and estimate the eigenvector centrality using a set of non-Markovian, history-dependent random walks on the same graph in a distributed manner. We also provide numerical results on various real-world directed graphs to confirm our theoretical findings, and present several practical enhancements to make our NMMC method ready for practical use in most directed graphs. To the best of our knowledge, the proposed NMMC framework for directed graphs is the first of its kind, lifting the limitations set by the standard MCMC methods for undirected graphs.
  4. Graph coloring assigns a color to each vertex of a graph such that no two adjacent vertices get the same color. It is a key building block in many applications. In practice, solutions that require fewer distinct colors and that can be computed faster are typically preferred. Various coloring heuristics exist that provide different quality versus speed tradeoffs. The highest-quality heuristics tend to be slow. To improve performance, several parallel implementations have been proposed. This paper describes two improvements of the widely used LDF heuristic. First, we present a “shortcutting” approach to increase the parallelism by non-speculatively breaking data dependencies. Second, we present “color reduction” techniques to boost the solution quality of LDF. On 18 graphs from various domains, the shortcutting approach yields 2.5 times more parallelism in the mean, and the color-reduction techniques improve the result quality by up to 20%. Our deterministic CUDA implementation running on a Titan V is 2.9 times faster in the mean and uses as few or fewer colors as the best GPU codes from the literature.
  5. Demeniconi, C.; Davidson, I. (Eds.)
    Many irregular domains such as social networks, financial transactions, neuron connections, and natural language constructs are represented using graph structures. In recent years, a variety of graph neural networks (GNNs) have been successfully applied for representation learning and prediction on such graphs. In many real-world applications, the underlying graph changes over time; however, most of the existing GNNs are inadequate for handling such dynamic graphs. In this paper we propose a novel technique for learning embeddings of dynamic graphs using a tensor algebra framework. Our method extends the popular graph convolutional network (GCN) for learning representations of dynamic graphs using the recently proposed tensor M-product technique. Theoretical results presented establish a connection between the proposed tensor approach and spectral convolution of tensors. The proposed method TM-GCN is consistent with the Message Passing Neural Network (MPNN) framework, accounting for both spatial and temporal message passing. Numerical experiments on real-world datasets demonstrate the performance of the proposed method for edge classification and link prediction tasks on dynamic graphs. We also consider an application related to the COVID-19 pandemic, and show how our method can be used for early detection of infected individuals from contact tracing data.