skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, May 23 until 2:00 AM ET on Friday, May 24 due to maintenance. We apologize for the inconvenience.

Title: Improved Dynamic Graph Learning through Fault-Tolerant Sparsification
Graph sparsification has been used to improve the computational cost of learning over graphs, e.g., Laplacian-regularized estimation and graph semi-supervised learning (SSL). However, when graphs vary over time, repeated sparsification requires polynomial order computational cost per update. We propose a new type of graph sparsification namely fault-tolerant (FT) sparsification to significantly reduce the cost to only a constant. Then the computational cost of subsequent graph learning tasks can be significantly improved with limited loss in their accuracy. In particular, we give theoretical analyze to upper bound the loss in the accuracy of the subsequent Laplacian-regularized estimation and graph SSL, due to the FT sparsification. In addition, FT spectral sparsification can be generalized to FT cut sparsification, for cut-based graph learning. Extensive experiments have confirmed the computational efficiencies and accuracies of the proposed methods for learning on dynamic graphs.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Proceedings of the 36th International Conference on Machine Learning
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency – given n input points, most kernel-based algorithms need to materialize the full n × n kernel matrix before performing any subsequent computation, thus incurring Ω(n^2) runtime. Breaking this quadratic barrier for various problems has therefore, been a subject of extensive research efforts. We break the quadratic barrier and obtain subquadratic time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving lin- ear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recently developed Kernel Density Estimation framework, which (after preprocessing in time subquadratic in n) can return estimates of row/column sums of the kernel matrix. In particular, we de- velop efficient reductions from weighted vertex and weighted edge sampling on kernel graphs, simulating random walks on kernel graphs, and importance sampling on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in sublinear (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsi- fication, where we observe a 9x decrease in the number of kernel evaluations over baselines for LRA and a 41x reduction in the graph size for spectral sparsification. 
    more » « less
  2. Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs. However, it remains notoriously challenging to train and inference GCNs over large graph datasets, limiting their application to large real-world graphs and hindering the exploration of deeper and more sophisticated GCN graphs. This is because as the graph size grows, the sheer number of node features and the large adjacency matrix can easily explode the required memory and data movements. To tackle the aforementioned challenges, we explore the possibility of drawing lottery tickets when sparsifying GCN graphs, i.e., subgraphs that largely shrink the adjacency matrix yet are capable of achieving accuracy comparable to or even better than their full graphs. Specifically, we for the first time discover the existence of graph early-bird (GEB) tickets that emerge at the very early stage when sparsifying GCN graphs, and propose a simple yet effective detector to automatically identify the emergence of such GEB tickets. Furthermore, we advocate graph-model co-optimization and develop a generic efficient GCN early-bird training framework dubbed GEBT that can significantly boost the efficiency of GCN training by (1) drawing joint early-bird tickets between the GCN graphs and models and (2) enabling simultaneously sparsification of both the GCN graphs and models. Experiments on various GCN models and datasets consistently validate our GEB finding and the effectiveness of our GEBT, e.g., our GEBT achieves up to 80.2% ~ 85.6% and 84.6% ~ 87.5% savings of GCN training and inference costs while offering a comparable or even better accuracy as compared to state-of-the-art methods. Our source code and supplementary appendix are available at 
    more » « less
  3. null (Ed.)
    Recent spectral graph sparsification techniques have shown promising performance in accelerating many numerical and graph algorithms, such as iterative methods for solving large sparse matrices, spectral partitioning of undirected graphs, vectorless verification of power/thermal grids, representation learning of large graphs, etc. However, prior spectral graph sparsification methods rely on fast Laplacian matrix solvers that are usually challenging to implement in practice. This work, for the first time, introduces a solver-free approach (SF-GRASS) for spectral graph sparsification by leveraging emerging spectral graph coarsening and graph signal processing (GSP) techniques. We introduce a local spectral embedding scheme for efficiently identifying spectrally-critical edges that are key to preserving graph spectral properties, such as the first few Laplacian eigenvalues and eigenvectors. Since the key kernel functions in SF-GRASS can be efficiently implemented using sparse-matrix-vector-multiplications (SpMVs), the proposed spectral approach is simple to implement and inherently parallel friendly. Our extensive experimental results show that the proposed method can produce a hierarchy of high-quality spectral sparsifiers in nearly-linear time for a variety of real-world, large-scale graphs and circuit networks when compared with prior state-of-the-art spectral methods. 
    more » « less
  4. Computing similarity between graphs is a fundamental and critical problem in graph-based applications, and one of the most commonly used graph similarity measures is graph edit distance (GED), defined as the minimum number of graph edit operations that transform one graph to another. Existing GED solutions suffer from severe performance issues due in particular to the NP-hardness of exact GED computation. Recently, deep learning has shown early promise for GED approximation with high accuracy and low computational cost. However, existing methods treat GED as a global, coarse-grained graph similarity value, while neglecting the type-specific transformative impacts incurred by different types of graph edit operations, including node insertion/deletion, node relabeling, edge insertion/deletion, and edge relabeling. In this paper, we propose a type-aware graph similarity learning and computation framework, TaGSim (T ype -a ware G raph Sim ilarity), that estimates GED in a fine-grained approach w.r.t. different graph edit types. Specifically, for each type of graph edit operations, TaGSim models its unique transformative impacts upon graphs, and encodes them into high-quality, type-aware graph embeddings, which are further fed into type-aware neural networks for accurate GED estimation. Extensive experiments on five real-world datasets demonstrate the effectiveness and efficiency of TaGSim, which significantly outperforms state-of-the-art GED solutions. 
    more » « less
  5. Recently there has been significant interest in using machine learning to improve the accuracy of cardinality estimation. This work has focused on improving average estimation error, but not all estimates matter equally for downstream tasks like query optimization. Since learned models inevitably make mistakes, the goal should be to improve the estimates that make the biggest difference to an optimizer. We introduce a new loss function, Flow-Loss, for learning cardinality estimation models. Flow-Loss approximates the optimizer's cost model and search algorithm with analytical functions, which it uses to optimize explicitly for better query plans. At the heart of Flow-Loss is a reduction of query optimization to a flow routing problem on a certain "plan graph", in which different paths correspond to different query plans. To evaluate our approach, we introduce the Cardinality Estimation Benchmark (CEB) which contains the ground truth cardinalities for sub-plans of over 16 K queries from 21 templates with up to 15 joins. We show that across different architectures and databases, a model trained with Flow-Loss improves the plan costs and query runtimes despite having worse estimation accuracy than a model trained with Q-Error. When the test set queries closely match the training queries, models trained with both loss functions perform well. However, the Q-Error-trained model degrades significantly when evaluated on slightly different queries (e.g., similar but unseen query templates), while the Flow-Loss-trained model generalizes better to such situations, achieving 4 -- 8× better 99th percentile runtimes on unseen templates with the same model architecture and training data. 
    more » « less