This work presents inGRASS, a novel algorithm for incremental spectral sparsification of large undirected graphs. The proposed inGRASS algorithm is highly scalable and parallel-friendly, with a nearly-linear-time setup phase and the ability to update the spectral sparsifier in O(log N) time for each incremental change made to the original graph with N nodes. A key component of the setup phase is a multilevel resistance embedding framework for efficiently identifying spectrally-critical edges and effectively detecting redundant ones; it leverages a low-resistance-diameter (LRD) decomposition scheme to partition the initial sparsifier into node clusters with bounded effective-resistance diameters. The update phase of inGRASS exploits low-dimensional node embedding vectors to efficiently estimate the importance and uniqueness of each newly added edge. As demonstrated through extensive experiments, inGRASS achieves over 200× speedups while retaining comparable solution quality for incremental spectral sparsification of graphs obtained from various datasets, such as circuit simulations, finite element analysis, and social networks.
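The edge-screening idea in the update phase can be sketched in a few lines of Python. The snippet below is a minimal illustration rather than the authors' implementation: it builds approximate effective-resistance embeddings for the current sparsifier with a random projection (using a dense pseudo-inverse for clarity where inGRASS would use its multilevel scheme), and the names `resistance_embedding`, `filter_new_edges`, and the `threshold` value are assumptions for illustration only.

```python
# Minimal sketch, assuming hypothetical names: approximate effective-resistance
# embeddings are built once for the current sparsifier, then each incremental
# edge is screened by its estimated leverage score w(u, v) * R(u, v).
import numpy as np
import networkx as nx

def resistance_embedding(G, k=16, seed=0):
    """Embed nodes so that ||z_u - z_v||^2 approximates the effective resistance
    R(u, v). A dense pseudo-inverse is used here purely for clarity."""
    nodes = list(G.nodes())
    idx = {v: i for i, v in enumerate(nodes)}
    n, m = len(nodes), G.number_of_edges()
    # Weighted incidence matrix B (m x n): row e has +sqrt(w_e) and -sqrt(w_e).
    B = np.zeros((m, n))
    for r, (u, v, w) in enumerate(G.edges(data="weight", default=1.0)):
        B[r, idx[u]] = np.sqrt(w)
        B[r, idx[v]] = -np.sqrt(w)
    L_pinv = np.linalg.pinv(B.T @ B)                       # Laplacian pseudo-inverse
    rng = np.random.default_rng(seed)
    Q = rng.choice([-1.0, 1.0], size=(k, m)) / np.sqrt(k)  # random projection
    Z = (Q @ B) @ L_pinv                                   # k-dimensional embeddings
    return {v: Z[:, idx[v]] for v in nodes}

def filter_new_edges(Z, new_edges, threshold=0.5):
    """Keep newly added edges whose estimated leverage score exceeds `threshold`;
    low-score edges span low-resistance clusters and are treated as redundant."""
    kept = []
    for u, v, w in new_edges:
        r_est = float(np.sum((Z[u] - Z[v]) ** 2))          # approx. R(u, v)
        if w * r_est > threshold:
            kept.append((u, v, w))
    return kept

# Usage: embed the current sparsifier once, then screen incremental edges.
G = nx.karate_club_graph()
Z = resistance_embedding(G)
print(filter_new_edges(Z, [(0, 1, 1.0), (0, 33, 1.0)]))
```

Edges with small estimated leverage scores connect nodes that already sit in a low-resistance cluster, which mirrors the importance/uniqueness test described in the abstract.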
Towards Scalable Spectral Embedding and Data Visualization via Spectral Coarsening
This paper proposes a scalable multilevel framework for the spectral embedding of large undirected graphs. The proposed method first computes much smaller yet sparse graphs that preserve the key spectral (structural) properties of the original graph, by exploiting a nearly-linear-time spectral graph coarsening approach. The resultant spectrally-coarsened graphs are then leveraged to develop much faster algorithms for multilevel spectral graph embedding (clustering) as well as for visualization of large data sets. We conducted extensive experiments using a variety of large graphs and datasets and obtained very promising results. For instance, we are able to coarsen the "coPapersCiteseer" graph with 0.43 million nodes and 16 million edges into a much smaller graph with only 13K (32X fewer) nodes and 17K (950X fewer) edges in about 16 seconds; the spectrally-coarsened graphs allow us to achieve up to 1,100X speedup for multilevel spectral graph embedding (clustering) and up to 60X speedup for t-SNE visualization of large data sets.
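A minimal coarsen-then-embed sketch of this pipeline is shown below. It substitutes a plain heavy-edge-style matching for the paper's spectral coarsening and uses a dense eigendecomposition for clarity; the function names `coarsen_once` and `coarse_spectral_embedding` are illustrative, not the authors' API.

```python
# Simplified coarsen-then-embed sketch; greedy matching stands in for spectral
# coarsening, and a dense eigendecomposition is used for clarity.
import numpy as np
import networkx as nx

def coarsen_once(G):
    """Greedily match neighboring node pairs into super-nodes; return the coarse
    graph and the fine-to-coarse node mapping."""
    mapping, matched, next_id = {}, set(), 0
    for u, v in G.edges():
        if u not in matched and v not in matched:
            mapping[u] = mapping[v] = next_id
            matched.update((u, v))
            next_id += 1
    for u in G.nodes():                          # unmatched nodes stay singletons
        if u not in matched:
            mapping[u] = next_id
            next_id += 1
    Gc = nx.Graph()
    Gc.add_nodes_from(range(next_id))
    for u, v, w in G.edges(data="weight", default=1.0):
        cu, cv = mapping[u], mapping[v]
        if cu != cv:                             # aggregate weights between clusters
            prev = Gc.get_edge_data(cu, cv, {"weight": 0.0})["weight"]
            Gc.add_edge(cu, cv, weight=prev + w)
    return Gc, mapping

def coarse_spectral_embedding(G, dim=2):
    """Spectral embedding computed on the coarse graph, then copied back
    (prolongated) to every fine node in its cluster."""
    Gc, mapping = coarsen_once(G)
    L = nx.normalized_laplacian_matrix(Gc).toarray()
    vals, vecs = np.linalg.eigh(L)               # eigenvalues in ascending order
    coarse_emb = vecs[:, 1:dim + 1]              # skip the trivial first eigenvector
    return {u: coarse_emb[mapping[u]] for u in G.nodes()}

emb = coarse_spectral_embedding(nx.karate_club_graph())
print(len(emb), emb[0])
```

In a multilevel setting the coarsening step would be applied recursively and the coarse embedding refined back through the levels; this single-level version only illustrates why computing the eigenvectors on the coarsened graph is much cheaper than on the original one.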
- Award ID(s): 2021309
- NSF-PAR ID: 10252376
- Date Published:
- Journal Name: ACM International Conference on Web Search and Data Mining
- Page Range / eLocation ID: 869 to 877
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Diffusion-based graph generative models are effective at generating high-quality small graphs, but they are hard to scale to large graphs containing thousands of nodes. In this work, we propose EDGE, a new diffusion-based graph generative model that addresses generative tasks for large graphs. The model is developed by reversing a discrete diffusion process that randomly removes edges until an empty graph is obtained, and it leverages graph sparsity in the diffusion process to improve computational efficiency. In particular, EDGE focuses on only a small portion of the graph nodes and only adds edges between these nodes; without compromising modeling ability, it makes far fewer edge predictions than previous diffusion-based generative models. Furthermore, EDGE can explicitly model the node degrees of the training graphs, which improves its ability to capture graph statistics. The empirical study shows that EDGE is much more efficient than competing methods and can generate large graphs with thousands of nodes. It also outperforms baseline models in generation quality: graphs generated by the proposed model have graph statistics closer to those of the training graphs. (A toy sketch of the forward edge-removal process is shown after this list.)
-
Recent spectral graph sparsification techniques have shown promising performance in accelerating many numerical and graph algorithms, such as iterative methods for solving large sparse matrices, spectral partitioning of undirected graphs, vectorless verification of power/thermal grids, and representation learning of large graphs. However, prior spectral graph sparsification methods rely on fast Laplacian matrix solvers that are usually challenging to implement in practice. This work, for the first time, introduces a solver-free approach (SF-GRASS) for spectral graph sparsification by leveraging emerging spectral graph coarsening and graph signal processing (GSP) techniques. We introduce a local spectral embedding scheme for efficiently identifying the spectrally-critical edges that are key to preserving graph spectral properties, such as the first few Laplacian eigenvalues and eigenvectors. Since the key kernel functions in SF-GRASS can be efficiently implemented using sparse matrix-vector multiplications (SpMVs), the proposed spectral approach is simple to implement and inherently parallel-friendly. Our extensive experimental results show that the proposed method can produce a hierarchy of high-quality spectral sparsifiers in nearly-linear time for a variety of real-world, large-scale graphs and circuit networks when compared with prior state-of-the-art spectral methods. (An illustrative SpMV-only edge-ranking sketch appears after this list.)
-
We present the first GPU-based parallel algorithm to efficiently update vertex coloring on large dynamic networks. For a single GPU, we introduce the concept of loosely maintained vertex color updates, which reduces computation and memory requirements. For multiple GPUs in distributed environments, we propose a priority-based ordering of vertices to reduce communication time. We prove the correctness of our algorithms and experimentally demonstrate that, for graphs of over 16 million vertices and over 134 million edges on a single GPU, our dynamic algorithm is as much as 20x faster than the state-of-the-art algorithm on static graphs. For larger graphs with over 130 million vertices and over 260 million edges, our distributed implementation with 8 GPUs produces updated color assignments within 160 milliseconds. In all cases, the proposed parallel algorithms produce comparable or fewer colors than state-of-the-art algorithms. (A sequential sketch of the per-vertex recoloring rule appears after this list.)
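For the EDGE abstract above, the following toy sketch illustrates only the forward corruption process it describes: edges are removed at random over T steps until the graph is empty, and the nodes whose degrees change at each step are the "active" nodes for which the learned reverse model would predict edges. The neural reverse process is not shown, and the name `forward_edge_removal` is illustrative.

```python
# Toy sketch of the forward corruption process only: edges are removed at
# random over T steps until the graph is empty; the nodes touched at each step
# are the "active" nodes the learned reverse model would make predictions for.
import random
import networkx as nx

def forward_edge_removal(G, T=8, seed=0):
    """Return, per diffusion step, the remaining edge list and the active nodes."""
    rng = random.Random(seed)
    edges = list(G.edges())
    rng.shuffle(edges)
    per_step = max(1, len(edges) // T)
    trajectory = []
    while edges:
        removed, edges = edges[:per_step], edges[per_step:]
        active = {u for e in removed for u in e}   # nodes whose degree changed
        trajectory.append((list(edges), active))
    return trajectory

for t, (remaining, active) in enumerate(forward_edge_removal(nx.karate_club_graph()), 1):
    print(f"step {t}: {len(remaining)} edges left, {len(active)} active nodes")
```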
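For the SF-GRASS abstract above, this sketch shows a solver-free, SpMV-only way to rank edges by spectral criticality in the spirit of the local spectral embedding it mentions. It is a generic illustration under that assumption, not the SF-GRASS algorithm itself, and the function names `smoothed_embeddings` and `rank_edges_by_criticality` are hypothetical.

```python
# Illustrative solver-free edge ranking: a few random vectors are smoothed with
# SpMV-only lazy random-walk iterations to approximate a low-frequency spectral
# embedding, and edges spanning large embedding distances are ranked as the
# most spectrally critical. This is a generic sketch, not SF-GRASS itself.
import numpy as np
import networkx as nx

def smoothed_embeddings(G, k=8, iters=10, seed=0):
    """Smooth k random vectors using only sparse matrix-vector products."""
    nodes = list(G.nodes())
    A = nx.to_scipy_sparse_array(G, nodelist=nodes, weight="weight", format="csr")
    d = np.asarray(A.sum(axis=1)).ravel()
    d[d == 0] = 1.0                                  # guard isolated nodes
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((len(nodes), k))
    for _ in range(iters):
        X = 0.5 * X + 0.5 * (A @ X) / d[:, None]     # lazy random-walk smoothing
        X -= X.mean(axis=0)                          # drop the trivial constant mode
    return nodes, X

def rank_edges_by_criticality(G, k=8, iters=10):
    """Score each edge by squared embedding distance (larger = more critical)."""
    nodes, X = smoothed_embeddings(G, k, iters)
    idx = {v: i for i, v in enumerate(nodes)}
    scores = {(u, v): float(np.sum((X[idx[u]] - X[idx[v]]) ** 2)) for u, v in G.edges()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(rank_edges_by_criticality(nx.karate_club_graph())[:5])
```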
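For the dynamic coloring abstract above, the sketch below shows, sequentially in Python, the per-vertex repair rule that a batched GPU update would apply in parallel: after edge insertions, only vertices with a color conflict are recolored with the smallest color unused by their neighbors. The GPU/distributed machinery and the loosely-maintained update strategy are not modeled, and the names are illustrative.

```python
# Sequential sketch of the per-vertex repair rule; the paper's contribution is
# applying such updates in parallel on one or more GPUs, which is not modeled.
import networkx as nx

def greedy_color(G):
    """Initial greedy coloring: each vertex takes the smallest color unused by
    its already-colored neighbors."""
    color = {}
    for v in G.nodes():
        used = {color[u] for u in G.neighbors(v) if u in color}
        color[v] = next(c for c in range(len(G)) if c not in used)
    return color

def insert_edges(G, color, new_edges):
    """Apply edge insertions, then repair only the vertices in conflict."""
    worklist = []
    for u, v in new_edges:
        G.add_edge(u, v)
        if color[u] == color[v]:
            worklist.append(v)                       # recolor one endpoint only
    for v in worklist:
        used = {color[u] for u in G.neighbors(v)}
        color[v] = next(c for c in range(len(G)) if c not in used)
    return color

G = nx.karate_club_graph()
color = greedy_color(G)
color = insert_edges(G, color, [(0, 9), (4, 33)])
assert all(color[u] != color[v] for u, v in G.edges())  # coloring is still proper
print(max(color.values()) + 1, "colors used after the update")
```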