Title: HP-GNN: Generating High Throughput GNN Training Implementation on CPU-FPGA Heterogeneous Platform
Award ID(s):
1911229 2009057
NSF-PAR ID:
10339663
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
The 30th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA)
Page Range / eLocation ID:
123 to 133
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recently, graph neural networks (GNNs), as the backbone of graph-based machine learning, have demonstrated great success in various domains (e.g., e-commerce). However, the performance of GNNs is usually unsatisfactory due to the highly sparse and irregular graph-based operations. To this end, we propose TC-GNN, the first GNN acceleration framework based on GPU Tensor Core Units (TCUs). The core idea is to reconcile the "Sparse" GNN computation with the high-performance "Dense" TCUs. Specifically, we conduct an in-depth analysis of the sparse operations in mainstream GNN computing frameworks. We introduce a novel sparse graph translation technique to facilitate TCU processing of the sparse GNN workload. We implement an effective CUDA core and TCU collaboration design to fully utilize GPU resources. We integrate TC-GNN with the PyTorch framework for high programmability. Rigorous experiments show an average of 1.70× speedup over the state-of-the-art DGL framework across various models and datasets. 
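     The sparse graph translation idea described above can be illustrated with a minimal sketch: for a window of adjacent rows of a CSR adjacency matrix, the union of neighbor columns is remapped to a compact index range, so the window's edges form a few dense tiles suitable for dense matrix units instead of one wide sparse stripe. The function name, window size, and tile width below are hypothetical illustration choices, not the paper's actual GPU kernel.

     ```python
     import numpy as np

     def condense_row_window(indptr, indices, row_start, window=16, tile_w=8):
         """For `window` consecutive rows of a CSR graph, gather the unique
         neighbor columns and remap them to a compact 0..k-1 range, so the
         window's nonzeros can be packed into ceil(k / tile_w) dense tiles."""
         cols = np.concatenate(
             [indices[indptr[r]:indptr[r + 1]]
              for r in range(row_start, row_start + window)]
         )
         uniq = np.unique(cols)                      # condensed column set
         remap = {c: i for i, c in enumerate(uniq)}  # original -> condensed id
         n_tiles = -(-len(uniq) // tile_w)           # ceil: dense tiles needed
         return uniq, remap, n_tiles

     # Toy CSR graph: row 0 -> {0, 5}, row 1 -> {5, 9}
     indptr = np.array([0, 2, 4])
     indices = np.array([0, 5, 5, 9])
     uniq, remap, n_tiles = condense_row_window(indptr, indices, 0,
                                                window=2, tile_w=2)
     # Three distinct neighbors {0, 5, 9} pack into 2 tiles of width 2,
     # rather than spanning a 10-column sparse stripe.
     ```

     The benefit is that the dense compute units then multiply only over the condensed columns that actually contain neighbors, rather than over the full (mostly empty) column range.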