Automating CUDA Synchronization via Program Transformation

Wu, Mingyuan; Zhang, Lingming; Liu, Cong; Tan, Shin Hwei; Zhang, Yuqun

doi:10.1109/ASE.2019.00075

Although CUDA is the most popular parallel computing platform and programming model for general-purpose GPU computing, synchronization in CUDA poses significant challenges for GPU programmers because of its intricate parallel computing mechanism and coding practices. In this paper, we propose AuCS, the first general framework to automate synchronization for CUDA kernel functions. AuCS transforms the control flow graph of the original LLVM-level CUDA program in a semantics-preserving manner to explore possible barrier function locations. AuCS then develops mechanisms to place barrier functions correctly, automating synchronization in multiple erroneous (and hard-to-detect) synchronization scenarios, including data races, barrier divergence, and redundant barrier functions. To evaluate the effectiveness and efficiency of AuCS, we conduct an extensive set of experiments; the results demonstrate that AuCS can automate 20 out of 24 erroneous synchronization scenarios.
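For readers unfamiliar with the three scenarios the abstract names, the following is a minimal illustrative sketch in CUDA. It is not taken from the paper and does not show how AuCS itself transforms a program; the kernel names, the block size `N`, and the host-side check are all hypothetical. It only shows where a correctly placed `__syncthreads()` sits relative to a data race, a divergent barrier, and a redundant barrier.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int N = 256;  // hypothetical block size; a single block is launched

// Data race: every thread writes tile[tid] and then reads a neighbour's slot.
// Without the barrier, the read could happen before the neighbour's write.
__global__ void shiftLeft(const int *in, int *out) {
    __shared__ int tile[N];
    int tid = threadIdx.x;
    tile[tid] = in[tid];
    __syncthreads();                 // orders all writes before any read
    out[tid] = tile[(tid + 1) % N];
}

// Barrier divergence: writing
//     if (tid < N / 2) { tile[tid] *= 2; __syncthreads(); }
// would let only half of the block reach the barrier, which is undefined
// behaviour. Here the barrier is placed outside the branch instead.
__global__ void scaleHalfThenReverse(int *data) {
    __shared__ int tile[N];
    int tid = threadIdx.x;
    tile[tid] = data[tid];
    if (tid < N / 2) tile[tid] *= 2; // each thread touches only its own slot
    __syncthreads();                 // reached by every thread in the block
    // Redundant barrier: a second __syncthreads() here would add nothing,
    // because no shared-memory access separates it from the one above.
    data[tid] = tile[N - 1 - tid];
}

int main() {
    int hIn[N], hOut[N];
    for (int i = 0; i < N; ++i) hIn[i] = i;

    int *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, sizeof hIn);
    cudaMalloc(&dOut, sizeof hOut);
    cudaMemcpy(dIn, hIn, sizeof hIn, cudaMemcpyHostToDevice);

    shiftLeft<<<1, N>>>(dIn, dOut);
    scaleHalfThenReverse<<<1, N>>>(dOut);

    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof hOut, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Recompute the expected result on the host and compare.
    bool ok = true;
    for (int i = 0; i < N; ++i) {
        int j = N - 1 - i;                    // slot read by thread i
        int expected = (j + 1) % N;           // value after shiftLeft
        if (j < N / 2) expected *= 2;         // lower half was doubled
        if (hOut[i] != expected) ok = false;
    }
    std::printf("%s\n", ok ? "results match" : "results differ");

    cudaFree(dIn);
    cudaFree(dOut);
    return ok ? 0 : 1;
}
```

Compiling with `nvcc` and running on a CUDA-capable device exercises both kernels. Removing the barrier from `shiftLeft` would reintroduce the data race, although a race of this kind may not show up on every run.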

Award ID(s): 1763906

PAR ID: 10175521

Author(s) / Creator(s): Wu, Mingyuan; Zhang, Lingming; Liu, Cong; Tan, Shin Hwei; Zhang, Yuqun

Date Published: 2019-11-01

Journal Name: IEEE/ACM International Conference on Automated Software Engineering

Page Range / eLocation ID: 748 to 759

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/ASE.2019.00075
