%AYan, Da
%AGuo, Guimu
%AChowdhury, Md
%AÖzsu, Tamer
%AKu, Wei-Shinn
%ALui, John
%D2020
%I
%K
%MOSTI ID: 10140007
%PMedium: X
%TG-thinker: A Distributed Framework for Mining Subgraphs in a Big Graph
%XMining subgraphs that satisfy certain conditions from a big graph is useful in many applications, such as community detection and subgraph matching. These problems have a high time complexity, but existing systems for scaling them are all IO-bound in execution. We propose the first truly CPU-bound distributed framework, called G-thinker, which adopts a user-friendly subgraph-centric vertex-pulling API for writing distributed subgraph mining algorithms. To utilize all CPU cores of a cluster, G-thinker features (1) a highly concurrent vertex cache for parallel task access and (2) a lightweight task scheduling approach that ensures high task throughput. These designs effectively overlap communication with computation to minimize CPU idle time. Extensive experiments demonstrate that G-thinker achieves orders of magnitude speedup even compared with the fastest existing subgraph-centric system, and that it scales well to much larger and denser real network data. G-thinker is open-sourced at http://bit.ly/gthinker with detailed documentation.