Controlled Kernel Launch for Dynamic Parallelism in GPUs

Tang, Xulong; Pattnaik, Ashutosh; Jiang, Huaipan; Kayiran, Onur; Jog, Adwait; Pai, Sreepathi; Ibrahim, Mohamed; Kandemir, Mahmut T.; Das, Chita R.

doi:10.1109/HPCA.2017.14

Dynamic parallelism (DP) is a promising feature for GPUs that allows on-demand spawning of kernels on the GPU without any CPU intervention. However, this feature has two major drawbacks. First, launching GPU kernels can incur significant performance penalties. Second, dynamically generated kernels are not always able to utilize the GPU cores efficiently because of hardware limits. To address these two concerns cohesively, we propose SPAWN, a runtime framework that controls the dynamically generated kernels, thereby directly reducing the associated launch overheads and queuing latency. Moreover, it allows a better mix of dynamically generated and original (parent) kernels, so that the scheduler can effectively hide the remaining overheads and improve the utilization of GPU resources. Our results show that, across 13 benchmarks, SPAWN achieves speedups of 69% and 57% over the flat (non-DP) implementation and baseline DP, respectively.

Award ID(s): 1657336

PAR ID: 10048265

Author(s) / Creator(s): Tang, Xulong; Pattnaik, Ashutosh; Jiang, Huaipan; Kayiran, Onur; Jog, Adwait; Pai, Sreepathi; Ibrahim, Mohamed; Kandemir, Mahmut T.; Das, Chita R.

Date Published: 2017-02-01

Journal Name: IEEE International Symposium on High Performance Computer Architecture (HPCA)

Page Range / eLocation ID: 649 to 660

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper: https://doi.org/10.1109/HPCA.2017.14
