Counting the frequency of subgraphs in large networks is a classic research question that reveals the underlying substructures of these networks for important applications. However, subgraph counting is a challenging problem, even for subgraph sizes as small as five, due to the combinatorial explosion in the number of possible occurrences. This article focuses on the fivecycle, which is an important special case of fivevertex subgraph counting and one of the most difficult to count efficiently. We design two new parallel fivecycle counting algorithms and prove that they are work efficient and achieve polylogarithmic span. Both algorithms are based on computing low outdegree orientations, which enables the efficient computation of directed twopaths and threepaths, and the algorithms differ in the ways in which they use this orientation to eliminate doublecounting. Additionally, we present new parallel algorithms for obtaining unbiased estimates of fivecycle counts using graph sparsification. We develop fast multicore implementations of the algorithms and propose a work scheduling optimization to improve their performance. Our experiments on a variety of realworld graphs using a 36core machine with twoway hyperthreading show that our best exact parallel algorithm achieves 10–46× selfrelative speedup, outperforms our serial benchmarks by 10–32×, and outperforms the previous stateoftheart serial algorithm by up to 818×. Our best approximate algorithm, for a reasonable probability parameter, achieves up to 20× selfrelative speedup and is able to approximate fivecycle counts 9–189× faster than our best exact algorithm, with between 0.52% and 11.77% error.
more »
« less
Parallel Algorithms for Butterfly Computations
Butterflies are the smallest nontrivial subgraph in bipartite graphs, and therefore having efficient computations for analyzing them is crucial to improving the quality of certain applications on bipartite graphs. In this paper, we design a framework called ParButterfly that contains new parallel algorithms for the following problems on processing butterflies: global counting, pervertex counting, peredge counting, tip decomposition (vertex peeling), and wing decomposition (edge peeling). The main component of these algorithms is aggregating wedges incident on subsets of vertices, and our framework supports different methods for wedge aggregation, including sorting, hashing, histogramming, and batching. In addition, ParButterfly supports different ways of ranking the vertices to speed up counting, including side ordering, approximate and exact degree ordering, and approximate and exact complement coreness ordering. For counting, ParButterfly also supports both exact computation as well as approximate computation via graph sparsification. We prove strong theoretical guarantees on the work and span of the algorithms in ParButterfly.
We perform a comprehensive evaluation of all of the algorithms in ParButterfly on a collection of realworld bipartite graphs using a 48core machine. Our counting algorithms obtain significant parallel speedup, outperforming the fastest sequential algorithms by up to 13.6× with a selfrelative speedup of up to 38.5×. Compared to general subgraph counting solutions, we are orders of magnitude faster. Our peeling algorithms achieve selfrelative speedups of up to 10.7× and outperform the fastest sequential baseline by up to several orders of magnitude.
more »
« less
 Award ID(s):
 1845763
 NSFPAR ID:
 10137063
 Date Published:
 Journal Name:
 Symposium on Algorithmic Principles of Computer Systems
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Butterflies are an important motif found in bipartite graphs that provide a structural way for finding dense regions within the graph. Beyond counting butterflies and enumerating them, other metrics and peeling for bipartite graphs are designed around counting butterfly motifs. The importance of counting butterflies has led to many works on efficient implementations for butterfly counting, given certain situational or hardware constraints. However, most algorithms are based on first counting the building block of the butterfly motif, and from that calculating the total possible number of butterflies in the graph. In this paper, using a linear algebra approach, we show that many provably correct algorithms for counting butterflies can be systematically derived. Moreover, we show how this formulation facilitates butterfly peeling algorithms that find the ktip and kwing subgraphs within a bipartite graph.more » « less

Finding from a big graph those subgraphs that satisfy certain conditions is useful in many applications such as community detection and subgraph matching. These problems have a high time complexity, but existing systems that attempt to scale them are all IObound in execution. We propose the first truly CPUbound distributed framework called Gthinker for subgraph finding algorithms, which adopts a taskbased computation model, and which also provides a userfriendly subgraphcentric vertexpulling API for writing distributed subgraph finding algorithms that can be easily adapted from existing serial algorithms. To utilize all CPU cores of a cluster, Gthinker features (1) a highly concurrent vertex cache for parallel task access and (2) a lightweight task scheduling approach that ensures high task throughput. These designs well overlap communication with computation to minimize the idle time of CPU cores. To further improve load balancing on graphs where the workloads of individual tasks can be drastically different due to biased graph density distribution, we propose to prioritize the scheduling of those tasks that tend to be long running for processing and decomposition, plus a timeout mechanism for task decomposition to prevent longrunning straggler tasks. The idea has been integrated into a novelty algorithm for maximum clique finding (MCF) that adopts a hybrid task decomposition strategy, which significantly improves the running time of MCF on dense and large graphs: The algorithm finds a maximum clique of size 1,109 on a large and dense WikiLinks graph dataset in 70 minutes. Extensive experiments demonstrate that Gthinker achieves orders of magnitude speedup compared even with the fastest existing subgraphcentric system, and it scales well to much larger and denser real network data. Gthinker is opensourced at http://bit.ly/gthinker with detailed documentation.more » « less

We consider the maximum vertexweighted matching problem (MVM), in which nonnegative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Although exact algorithms for MVM are faster than exact algorithms for the maximum edgeweighted matching problem, there are graphs on which these exact algorithms could take hundreds of hours. For a natural number k, we design a k/(k + 1)approximation algorithm for MVM on nonbipartite graphs that updates the matching along certain short paths in the graph: either augmenting paths of length at most 2k + 1 or weightincreasing paths of length at most 2k. The choice of k = 2 leads to a 2/3approximation algorithm that computes nearly optimal weights fast. This algorithm could be initialized with a 2/3approximate maximum cardinality matching to reduce its runtime in practice. A 1/2approximation algorithm may be obtained using k = 1, which is faster than the 2/3approximation algorithm but it computes lower weights. The 2/3approximation algorithm has time complexity O(Δ2m) while the time complexity of the 1/2approximation algorithm is O(Δm), where m is the number of edges and Δ is the maximum degree of a vertex. Results from our serial implementations show that on average the 1/2approximation algorithm runs faster than the Suitor algorithm (currently the fastest 1/2approximation algorithm) while the 2/3approximation algorithm runs as fast as the Suitor algorithm but obtains higher weights for the matching. One advantage of the proposed algorithms is that they are wellsuited for parallel implementation since they can process vertices to match in any order. The 1/2 and 2/3approximation algorithms have been implemented on a shared memory parallel computer, and both approximation algorithms obtain good speedups, while the former algorithm runs faster on average than the parallel Suitor algorithm. Care is needed to design the parallel algorithm to avoid cyclic waits, and we prove that it is livelock free.more » « less

Over the last two decades, frameworks for distributedmemory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the defacto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting kcliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space. In addition to our theoretical results, we simulate our triangle counting algorithms in realworld graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems.more » « less