Given a userspecified minimum degree threshold γ, a γquasiclique is a subgraph where each vertex connects to at least γ fraction of the other vertices. Quasiclique is a natural definition for dense structures, so finding large and hence statistically significant quasicliques is useful in applications such as community detection in social networks and discovering significant biomolecule structures and pathways. However, mining maximal quasicliques is notoriously expensive, and even a recent algorithm for mining large maximal quasicliques is flawed and can lead to a lot of repeated searches. This paper proposes a parallel solution for mining maximal quasicliques that is able to fully utilize CPU cores. Our solution utilizes divide and conquer to decompose the workloads into independent tasks for parallel mining, and we addressed the problem of (i) drastic load imbalance among different tasks and (ii) difficulty in predicting the task running time and the time growth with task subgraph size, by (a) using a timeoutbased task decomposition strategy, and by (b) utilizing a priority task queue to schedule longrunning tasks earlier for mining and decomposition to avoid stragglers. Unlike our conference version in PVLDB 2020 where the solution was built on a distributed graph mining framework called Gthinker, this paper targets a singlemachine multicore environment which is more accessible to an average end user. A general framework called Tthinker is developed to facilitate the programming of parallel programs for algorithms that adopt divide and conquer, including but not limited to our quasiclique mining algorithm. Additionally, we consider the problem of directly mining large quasicliques from dense parts of a graph, where we identify the repeated search issue of a recent method and address it using a carefully designed concurrent trie data structure. Extensive experiments verify that our parallel solution scales well with the number of CPU cores, achieving 26.68× runtime speedup when mining a graph with 3.77M vertices and 16.5M edges with 32 mining threads. Additionally, mining large quasicliques from dense parts can provide an additional speedup of up to 89.46×.
more »
« less
YACC: A Framework Generalizing TuránShadow for Counting Large Cliques
Cliquecounting is a fundamental problem that has application in many areas eg. dense subgraph discovery, community detection, spam detection, etc. The problem of kcliquecounting is difficult because as k increases, the number of kcliques goes up exponentially. Enumeration algorithms (even parallel ones) fail to count kcliques beyond a small k. Approximation algorithms, like TuránShadow have been shown to perform well upto k = 10, but are inefficient for larger cliques. The recently proposed Pivoter algorithm significantly improved the stateoftheart and was able to give exact counts of all kcliques in a large number of graphs. However, the clique counts of some graphs (for example, comlj) are still out of reach of these algorithms.
We revisit the TuránShadow algorithm and propose a generalized framework called YACC that leverages several insights about realworld graphs to achieve faster cliquecounting. The bottleneck in TuránShadow is a recursive subroutine whose stopping condition is based on a classic result from extremal combinatorics called Turán's theorem. This theorem gives a lower bound for the kclique density in a subgraph in terms of its edge density. However, this stopping condition is based on a worstcase graph that does not reflect the nature of realworld graphs. Using techniques for quickly discovering dense subgraphs, we relax the stopping condition in a systematic way such that we get a smaller recursion tree while still maintaining the guarantees provided by TuránShadow. We deploy our algorithm on several realworld data sets and show that YACC reduces the size of the recursion tree and the running time by over an order of magnitude. Using YACC, we are able to obtain clique counts for several graphs for which cliquecounting was infeasible before, including comlj.
more »
« less
 NSFPAR ID:
 10380841
 Date Published:
 Journal Name:
 SDM2022
 Page Range / eLocation ID:
 684  692
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Quasicliques are a type of dense subgraphs that generalize the notion of cliques, important for applications such as community/module detection in various social and biological networks. However, the existing quasiclique definition and algorithms are only applicable to undirected graphs. In this paper, we generalize the concept of quasicliques to directed graphs by proposing $(\gamma_1, \gamma_2)$quasicliques which have density requirements in both inbound and outbound directions of each vertex in a quasiclique subgraph. An efficient recursive algorithm is proposed to find maximal $(\gamma_1, \gamma_2)$quasicliques which integrates many effective pruning rules that are validated by ablation studies. We also study the finding of top$k$ large quasicliques directly by bootstrapping the search from more compact quasicliques, to scale the mining to larger networks. The algorithms are parallelized with effective load balancing, and we demonstrate that they can scale up effectively with the number of CPU cores.more » « less

Over the last two decades, frameworks for distributedmemory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the defacto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting kcliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space. In addition to our theoretical results, we simulate our triangle counting algorithms in realworld graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems.more » « less

Finding from a big graph those subgraphs that satisfy certain conditions is useful in many applications such as community detection and subgraph matching. These problems have a high time complexity, but existing systems that attempt to scale them are all IObound in execution. We propose the first truly CPUbound distributed framework called Gthinker for subgraph finding algorithms, which adopts a taskbased computation model, and which also provides a userfriendly subgraphcentric vertexpulling API for writing distributed subgraph finding algorithms that can be easily adapted from existing serial algorithms. To utilize all CPU cores of a cluster, Gthinker features (1) a highly concurrent vertex cache for parallel task access and (2) a lightweight task scheduling approach that ensures high task throughput. These designs well overlap communication with computation to minimize the idle time of CPU cores. To further improve load balancing on graphs where the workloads of individual tasks can be drastically different due to biased graph density distribution, we propose to prioritize the scheduling of those tasks that tend to be long running for processing and decomposition, plus a timeout mechanism for task decomposition to prevent longrunning straggler tasks. The idea has been integrated into a novelty algorithm for maximum clique finding (MCF) that adopts a hybrid task decomposition strategy, which significantly improves the running time of MCF on dense and large graphs: The algorithm finds a maximum clique of size 1,109 on a large and dense WikiLinks graph dataset in 70 minutes. Extensive experiments demonstrate that Gthinker achieves orders of magnitude speedup compared even with the fastest existing subgraphcentric system, and it scales well to much larger and denser real network data. Gthinker is opensourced at http://bit.ly/gthinker with detailed documentation.more » « less

Counting the frequency of subgraphs in large networks is a classic research question that reveals the underlying substructures of these networks for important applications. However, subgraph counting is a challenging problem, even for subgraph sizes as small as five, due to the combinatorial explosion in the number of possible occurrences. This article focuses on the fivecycle, which is an important special case of fivevertex subgraph counting and one of the most difficult to count efficiently. We design two new parallel fivecycle counting algorithms and prove that they are work efficient and achieve polylogarithmic span. Both algorithms are based on computing low outdegree orientations, which enables the efficient computation of directed twopaths and threepaths, and the algorithms differ in the ways in which they use this orientation to eliminate doublecounting. Additionally, we present new parallel algorithms for obtaining unbiased estimates of fivecycle counts using graph sparsification. We develop fast multicore implementations of the algorithms and propose a work scheduling optimization to improve their performance. Our experiments on a variety of realworld graphs using a 36core machine with twoway hyperthreading show that our best exact parallel algorithm achieves 10–46× selfrelative speedup, outperforms our serial benchmarks by 10–32×, and outperforms the previous stateoftheart serial algorithm by up to 818×. Our best approximate algorithm, for a reasonable probability parameter, achieves up to 20× selfrelative speedup and is able to approximate fivecycle counts 9–189× faster than our best exact algorithm, with between 0.52% and 11.77% error.more » « less