This work presents inGRASS, a novel algorithm designed for incremental spectral sparsification of large undirected graphs. The proposed inGRASS algorithm is highly scalable and parallel-friendly, having a nearly linear time complexity for the setup phase and the ability to update the spectral sparsifier in O(logN) time for each incremental change made to the original graph with N nodes. A key component in the setup phase of inGRASS is a multilevel resistance embedding framework introduced for efficiently identifying spectrally-critical edges and effectively detecting redundant ones, which is achieved by decomposing the initial sparsifier into many node clusters with bounded effective-resistance diameters leveraging a low-resistance-diameter decomposition (LRD) scheme. The update phase of inGRASS exploits low-dimensional node embedding vectors for efficiently estimating the importance and uniqueness of each newly added edge. As demonstrated through extensive experiments, inGRASS achieves up to over 200× speedups while retaining comparable solution quality in incremental spectral sparsification of graphs obtained from various datasets, such as circuit simulations, finite element analysis, and social networks. 
                        more » 
                        « less   
                    
                            
                            Towards Scalable Spectral Embedding and Data Visualization via Spectral Coarsening
                        
                    
    
            This paper proposes a scalable multilevel framework for the spectral embedding of large undirected graphs. The proposed method first computes much smaller yet sparse graphs while preserving the key spectral (structural) properties of the original graph, by exploiting a nearly-linear time spectral graph coarsening approach. Then, the resultant spectrally-coarsened graphs are leveraged for the development of much faster algorithms for multilevel spectral graph embedding (clustering) as well as visualization of large data sets. We conducted extensive experiments using a variety of large graphs and datasets and obtained very promising results. For instance, we are able to coarsen the "coPapersCiteseer" graph with 0.43 million nodes and 16 million edges into a much smaller graph with only 13K (32X fewer) nodes and 17K (950X fewer) edges in about 16 seconds; the spectrally-coarsened graphs allow us to achieve up to 1,100X speedup for multilevel spectral graph embedding (clustering) and up to 60X speedup for t-SNE visualization of large data sets. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 2021309
- PAR ID:
- 10252376
- Date Published:
- Journal Name:
- ACM International Conference on Web Search and Data Mining
- Page Range / eLocation ID:
- 869 to 877
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Diffusion-based graph generative models are effective in generating high-quality small graphs. However, it is hard to scale them to large graphs that contain thousands of nodes. In this work, we propose EDGE, a new diffusion-based graph generative model that addresses generative tasks for large graphs. The model is developed by reversing a discrete diffusion process that randomly removes edges until obtaining an empty graph. It leverages graph sparsity in the diffusion process to improve computational efficiency. In particular, EDGE only focuses on a small portion of graph nodes and only adds edges between these nodes. Without compromising modeling ability, it makes much fewer edge predictions than previous diffusion-based generative models. Furthermore, EDGE can explicitly model the node degrees of training graphs and then gain performance improvement in capturing graph statistics. The empirical study shows that EDGE is much more efficient than competing methods and can generate large graphs with thousands of nodes. It also outperforms baseline models in generation quality: graphs generated by the proposed model have graph statistics more similar to those of training graphs.more » « less
- 
            Diffusion-based graph generative models are effective in generating high-quality small graphs. However, it is hard to scale them to large graphs that contain thousands of nodes. In this work, we propose EDGE, a new diffusion-based graph generative model that addresses generative tasks for large graphs. The model is developed by reversing a discrete diffusion process that randomly removes edges until obtaining an empty graph. It leverages graph sparsity in the diffusion process to improve computational efficiency. In particular, EDGE only focuses on a small portion of graph nodes and only adds edges between these nodes. Without compromising modeling ability, it makes much fewer edge predictions than previous diffusion-based generative models. Furthermore, EDGE can explicitly model the node degrees of training graphs and then gain performance improvement in capturing graph statistics. The empirical study shows that EDGE is much more efficient than competing methods and can generate large graphs with thousands of nodes. It also outperforms baseline models in generation quality: graphs generated by the proposed model have graph statistics more similar to those of training graphs.more » « less
- 
            null (Ed.)Recent spectral graph sparsification techniques have shown promising performance in accelerating many numerical and graph algorithms, such as iterative methods for solving large sparse matrices, spectral partitioning of undirected graphs, vectorless verification of power/thermal grids, representation learning of large graphs, etc. However, prior spectral graph sparsification methods rely on fast Laplacian matrix solvers that are usually challenging to implement in practice. This work, for the first time, introduces a solver-free approach (SF-GRASS) for spectral graph sparsification by leveraging emerging spectral graph coarsening and graph signal processing (GSP) techniques. We introduce a local spectral embedding scheme for efficiently identifying spectrally-critical edges that are key to preserving graph spectral properties, such as the first few Laplacian eigenvalues and eigenvectors. Since the key kernel functions in SF-GRASS can be efficiently implemented using sparse-matrix-vector-multiplications (SpMVs), the proposed spectral approach is simple to implement and inherently parallel friendly. Our extensive experimental results show that the proposed method can produce a hierarchy of high-quality spectral sparsifiers in nearly-linear time for a variety of real-world, large-scale graphs and circuit networks when compared with prior state-of-the-art spectral methods.more » « less
- 
            State-of-the-art hypergraph partitioners follow the multilevel paradigm that constructs multiple levels of progressively coarser hypergraphs that are used to drive cut refinements on each level of the hierarchy. Multilevel partitioners are subject to two limitations: (i) Hypergraph coarsening processes rely on local neighborhood structure without fully considering the global structure of the hypergraph. (ii) Refinement heuristics can stagnate on local minima. In this paper, we describe SpecPart, the first supervised spectral framework that directly tackles these two limitations. SpecPart solves a generalized eigenvalue problem that captures the balanced partitioning objective and global hypergraph structure in a low-dimensional vertex embedding while leveraging initial high-quality solutions from multilevel partitioners as hints. SpecPart further constructs a family of trees from the vertex embedding and partitions them with a tree-sweeping algorithm. Then, a novel overlay of multiple tree-based partitioning solutions, followed by lifting to a coarsened hypergraph, where an ILP partitioning instance is solved to alleviate local stagnation. We have validated SpecPart on multiple sets of benchmarks. Experimental results show that for some benchmarks, our SpecPart can substantially improve the cutsize by more than 50% with respect to the best published solutions obtained with leading partitioners hMETIS and KaHyPar.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    