Title: The ParClusterers Benchmark Suite (PCBS): A Fine-Grained Analysis of Scalable Graph Clustering
We introduce the ParClusterers Benchmark Suite (PCBS)---a collection of highly scalable parallel graph clustering algorithms and benchmarking tools that streamline comparing different graph clustering algorithms and implementations. The benchmark includes clustering algorithms that target a wide range of modern clustering use cases, including community detection, classification, and dense subgraph mining. The benchmark toolkit makes it easy to run and evaluate multiple instances of different clustering algorithms with respect to both running time and quality. We evaluate the PCBS algorithms empirically and find that they deliver both state-of-the-art quality and state-of-the-art running time. In terms of running time, they are on average over 4x faster than the fastest library we compared against. In terms of quality, the correlation clustering algorithm [Shi et al., VLDB'21] optimizing the LambdaCC objective, which has no direct counterpart in other libraries, delivers the highest quality on the majority of the datasets we used.
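A quality-versus-time comparison of this kind can be expressed in a few lines of driver code. The sketch below only illustrates the workflow and is not PCBS's actual interface: the NetworkX clusterers, the toy graph, and modularity as the quality measure are stand-ins chosen for brevity.

```python
# Illustrative sketch of a time-vs-quality comparison; not PCBS's API.
import time
import networkx as nx
from networkx.algorithms.community import (
    louvain_communities,
    label_propagation_communities,
    modularity,
)

G = nx.karate_club_graph()  # placeholder input graph

clusterers = {
    "louvain": lambda g: louvain_communities(g, seed=0),
    "label_propagation": lambda g: list(label_propagation_communities(g)),
}

for name, run in clusterers.items():
    start = time.perf_counter()
    communities = run(G)
    elapsed = time.perf_counter() - start
    q = modularity(G, communities)  # one of several possible quality measures
    print(f"{name:20s} time={elapsed:.4f}s modularity={q:.3f}")
```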
Award ID(s):
2403235
PAR ID:
10636087
Author(s) / Creator(s):
Publisher / Repository:
Proceedings of the VLDB Endowment
Date Published:
Journal Name:
Proceedings of the VLDB Endowment
Volume:
18
Issue:
3
ISSN:
2150-8097
Page Range / eLocation ID:
836 to 849
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
1. A fundamental building block in any graph algorithm is a graph container -- a data structure used to represent the graph. Ideally, a graph container enables efficient access to the underlying graph, has low space usage, and supports updating the graph efficiently. In this paper, we conduct an extensive empirical evaluation of graph containers designed to support running algorithms on large graphs. To our knowledge, this is the first apples-to-apples comparison of graph containers rather than overall systems, which include confounding factors such as differences in algorithm implementations and infrastructure. We measure the running time of 10 highly optimized algorithms across over 20 different containers and 10 graphs. Somewhat surprisingly, we find that the average algorithm running time does not differ much across containers, especially those that support dynamic updates. Specifically, a simple container based on an off-the-shelf B-tree is only 1.22× slower on average than a highly optimized static one. Moreover, we observe that simplifying a graph-container Application Programming Interface (API) to only a few simple functions incurs a mere 1.16× slowdown compared to a complete API. Finally, we also measure batch-insert throughput in dynamic-graph containers for a full picture of their performance. To perform the benchmarks, we introduce BYO, a unified framework that standardizes evaluations of graph-algorithm performance across different graph containers. BYO extends the Graph Based Benchmark Suite (Dhulipala et al. '18), a state-of-the-art graph algorithm benchmark, to easily plug into different dynamic graph containers and enable fair comparisons between them on a large suite of graph algorithms. While several graph algorithm benchmarks have been developed to date, to the best of our knowledge, BYO is the first system designed to benchmark graph containers.
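To make the "few simple functions" point concrete, here is a hypothetical sketch of the minimal-API idea: an algorithm written against a tiny container interface runs unchanged on any container that implements it. The names below are illustrative and are not BYO's actual interface.

```python
# Hypothetical minimal graph-container interface; not BYO's real API.
from collections import deque
from typing import Iterable, Protocol


class GraphContainer(Protocol):
    def num_nodes(self) -> int: ...
    def neighbors(self, u: int) -> Iterable[int]: ...


class AdjacencyListGraph:
    """A simple static container backed by Python lists."""

    def __init__(self, n: int, edges: list[tuple[int, int]]):
        self._adj = [[] for _ in range(n)]
        for u, v in edges:
            self._adj[u].append(v)
            self._adj[v].append(u)

    def num_nodes(self) -> int:
        return len(self._adj)

    def neighbors(self, u: int) -> Iterable[int]:
        return self._adj[u]


def bfs_distances(g: GraphContainer, source: int) -> list[int]:
    """Runs against any container exposing the two-method interface."""
    dist = [-1] * g.num_nodes()
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in g.neighbors(u):
            if dist[v] == -1:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


g = AdjacencyListGraph(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
print(bfs_distances(g, 0))  # [0, 1, 2, 3, 4]
```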
2. Untrusted third parties in commercial-off-the-shelf (COTS) printed circuit board (PCB) supply chains may poison PCBs with hardware, firmware, and software implants. Hence, we focus on detecting malicious implants in PCBs. State-of-the-art hardware Trojan detection methods require a golden PCB system/model to detect malicious implants and do not scale to large COTS PCB systems. We map a COTS PCB system to a graph and propose a golden-free methodology comprising a graph-based mathematical construction on node and edge equivalences, clustering of identical nodes and paths, and validation of hypothesized statistical properties on measured side-channel data. We evaluate the methodology on a multi-PCB testbed with hierarchically networked PCB devices and several types of Trojans.
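As a loose illustration of what golden-free validation can look like, the sketch below checks whether devices hypothesized to be identical produce statistically indistinguishable side-channel measurements, flagging pairs whose distributions diverge. The simulated data, the two-sample test, and the threshold are all assumptions made for the example; this is not the paper's construction.

```python
# Golden-free checking idea (illustrative only, not the paper's methodology):
# nodes hypothesized to be equivalent should yield statistically similar
# side-channel measurements, so a large divergence flags a candidate implant.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Simulated power traces (arbitrary units) for three devices assumed identical.
measurements = {
    "device_A": rng.normal(1.00, 0.05, size=500),
    "device_B": rng.normal(1.01, 0.05, size=500),
    "device_C": rng.normal(1.20, 0.05, size=500),  # behaves differently
}

ALPHA = 0.01  # significance threshold (an assumption, not from the paper)
names = list(measurements)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        stat, p = ks_2samp(measurements[names[i]], measurements[names[j]])
        flag = "SUSPECT" if p < ALPHA else "ok"
        print(f"{names[i]} vs {names[j]}: KS={stat:.3f} p={p:.3g} -> {flag}")
```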
3. Graph clustering is a fundamental problem in social network analysis, the goal of which is to group the vertices of a graph into a series of densely knit clusters, with each cluster well separated from all the others. Classical graph clustering methods take advantage of the graph topology to model and quantify vertex proximity. With the proliferation of rich graph content, such as user profiles in social networks and gene annotations in protein interaction networks, it is essential to consider both the structure and the content of graphs for high-quality graph clustering. In this paper, we propose a graph embedding approach to clustering content-enriched graphs. The key idea is to embed each vertex of a graph into a continuous vector space where the localized structural and attributive information of vertices can be encoded in a unified, latent representation. Specifically, we quantify vertex-wise attribute proximity into edge weights and employ truncated, attribute-aware random walks to learn the latent representations of vertices. We evaluate our attribute-aware graph embedding method on real-world attributed graphs, and the results demonstrate its effectiveness in comparison with state-of-the-art algorithms.
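The sketch below illustrates the general idea of attribute-aware walks under simplifying assumptions: edge weights are cosine similarities of endpoint attributes, and truncated random walks sample neighbors in proportion to those weights. The toy graph and similarity choice are illustrative only, and the subsequent embedding step (e.g., a skip-gram model over the walks) is omitted; this is not the paper's exact method.

```python
# Attribute-aware truncated random walks on a toy attributed graph (sketch).
import numpy as np

rng = np.random.default_rng(42)

# Toy attributed graph: adjacency lists and one attribute vector per vertex.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
attrs = {0: np.array([1.0, 0.0]), 1: np.array([0.9, 0.1]),
         2: np.array([0.0, 1.0]), 3: np.array([0.8, 0.2])}


def edge_weight(u, v):
    """Cosine similarity of the endpoints' attributes, floored at a small value."""
    a, b = attrs[u], attrs[v]
    sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return max(sim, 1e-3)


def truncated_walk(start, length=5):
    """Sample a short walk, picking neighbors in proportion to edge weight."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj[walk[-1]]
        w = np.array([edge_weight(walk[-1], v) for v in nbrs])
        walk.append(int(rng.choice(nbrs, p=w / w.sum())))
    return walk


walks = [truncated_walk(v) for v in adj for _ in range(3)]
print(walks[:4])  # these walks would then feed a skip-gram style embedder
```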
4. Clustering plays a crucial role in computer science, facilitating data analysis and problem-solving across numerous fields. By partitioning large datasets into meaningful groups, clustering reveals hidden structures and relationships within the data, aiding tasks such as unsupervised learning, classification, anomaly detection, and recommendation systems. Particularly in relational databases, where data is distributed across multiple tables, efficient clustering is essential yet challenging due to the computational complexity of joining tables. This paper addresses this challenge by introducing efficient algorithms for k-median and k-means clustering on relational data without the need to pre-compute the join query results. For relational k-median clustering, we propose the first efficient relative approximation algorithm. For relational k-means clustering, our algorithm significantly improves both the approximation factor and the running time of the known relational k-means clustering algorithms, which suffer from either large constant approximation factors or expensive running times. Given a join query q and a database instance D of O(N) tuples, for both k-median and k-means clustering on the results of q on D, we propose randomized (1+ε)γ-approximation algorithms that run in roughly O(k²N^fhw) + T_γ(k²) time, where ε ∈ (0,1) is a constant parameter chosen by the user, fhw is the fractional hypertree width of q, and γ and T_γ(x) denote, respectively, the approximation factor and the running time of a traditional clustering algorithm in the standard computational setting over x points.
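For contrast, the following sketch shows the naive baseline that such relational algorithms avoid: materialize the join result and hand it to an off-the-shelf k-means. The schema and data are invented for illustration; the point is that the materialized join can be much larger than the input tables, which is the cost the paper's algorithms sidestep.

```python
# Naive baseline (illustrative): materialize the join, then run k-means on it.
import sqlite3
import numpy as np
from sklearn.cluster import KMeans

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(uid INTEGER, x REAL);
    CREATE TABLE S(uid INTEGER, y REAL);
""")
conn.executemany("INSERT INTO R VALUES (?, ?)",
                 [(i % 10, float(i)) for i in range(100)])
conn.executemany("INSERT INTO S VALUES (?, ?)",
                 [(i % 10, float(i) * 0.5) for i in range(100)])

# Materializing the join: its size can blow up far beyond the input tables.
rows = conn.execute("SELECT R.x, S.y FROM R JOIN S ON R.uid = S.uid").fetchall()
points = np.array(rows)
print("join size:", len(points), "vs. total input size:", 200)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(points)
print("cluster sizes:", np.bincount(labels))
```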
5. Stream clustering is an important data mining technique for capturing the evolving patterns in real-time data streams. Today's data streams, e.g., IoT events and Web clicks, are usually high-speed and contain dynamically changing patterns. Existing stream clustering algorithms usually follow an online-offline paradigm with a one-record-at-a-time update model, which was designed for running on a single machine. These stream clustering algorithms, with this sequential update model, cannot be efficiently parallelized and fail to deliver the required high throughput for stream clustering. In this paper, we present DistStream, a distributed framework that can effectively scale out online-offline stream clustering algorithms. To parallelize these algorithms for high throughput, we develop a mini-batch update model with efficient parallelization approaches. To maintain high clustering quality, DistStream's mini-batch update model preserves the update order in all the computation steps during parallel execution, which can reflect recent changes in dynamically changing streaming data. We implement DistStream atop Spark Streaming, along with four representative stream clustering algorithms based on it. Our evaluation on three real-world datasets shows that DistStream-based stream clustering algorithms can achieve sublinear throughput gain and clustering quality comparable (99%) to that of their single-machine counterparts.
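The sketch below shows a mini-batch update model in miniature, under simplifying assumptions (centroid-style micro-clusters, batches of ten records): the assignment step over a batch is embarrassingly parallel, while the maintenance step folds points into their clusters in arrival order. It is only an illustration of the update model, not DistStream's Spark Streaming implementation.

```python
# Mini-batch micro-cluster update (sketch; not DistStream's implementation).
import numpy as np


class MicroClusters:
    def __init__(self, centers):
        self.centers = np.asarray(centers, dtype=float)
        self.counts = np.ones(len(self.centers))

    def update_batch(self, batch):
        batch = np.asarray(batch, dtype=float)
        # Assignment step: independent per record, so parallelizable per batch.
        d = np.linalg.norm(batch[:, None, :] - self.centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # Maintenance step: fold points into their clusters in arrival order.
        for x, c in zip(batch, assign):
            self.counts[c] += 1
            self.centers[c] += (x - self.centers[c]) / self.counts[c]


rng = np.random.default_rng(1)
mc = MicroClusters(centers=[[0.0, 0.0], [5.0, 5.0]])
stream = rng.normal(loc=[[0, 0]] * 50 + [[5, 5]] * 50, scale=0.3)
for start in range(0, len(stream), 10):  # mini-batches of 10 records
    mc.update_batch(stream[start:start + 10])
print(np.round(mc.centers, 2))
```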