Modeling and Analyzing Evaluation Cost of CUDA Kernels

Muller, Stefan K; Hoffmann, Jan

doi:10.1145/3639403

Motivated by the increasing importance of general-purpose graphics processing unit (GPGPU) programming, exemplified by NVIDIA’s CUDA framework, as well as the difficulty, especially for novice programmers, of reasoning about performance in GPGPU kernels, we introduce a novel quantitative program logic for CUDA kernels. The logic allows programmers to reason about both the functional correctness and the resource usage of CUDA kernels, paying particular attention to a set of common but CUDA-specific performance bottlenecks: warp divergences, uncoalesced memory accesses, and bank conflicts. The logic is proved sound with respect to a novel operational cost semantics for CUDA kernels. The semantics, logic, and soundness proofs are formalized in Coq. An inference algorithm based on LP solving automatically synthesizes symbolic resource bounds by generating derivations in the logic. This algorithm is the basis of RaCUDA, an end-to-end resource-analysis tool for kernels, which has been implemented using an existing resource-analysis tool for imperative programs. An experimental evaluation on a suite of benchmarks shows that the analysis is effective in aiding the detection of performance bugs in CUDA kernels.
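
The three bottlenecks named in the abstract can be illustrated with a toy per-warp cost count. The sketch below is not the paper's cost semantics and is not part of RaCUDA; it is a minimal illustration that assumes 32 threads per warp, 32 shared-memory banks of 4-byte words, and 128-byte global-memory segments. All function and constant names are hypothetical.

```python
# Toy per-warp cost counts for the three bottlenecks named in the abstract.
# All constants are assumptions made for illustration only.
WARP_SIZE = 32       # threads per warp (assumed)
NUM_BANKS = 32       # shared-memory banks (assumed)
BANK_WIDTH = 4       # bytes per bank word (assumed)
SEGMENT_BYTES = 128  # bytes served by one global-memory transaction (assumed)


def global_transactions(byte_addresses):
    """Count the distinct segments touched by one warp-wide global access."""
    return len({addr // SEGMENT_BYTES for addr in byte_addresses})


def bank_conflict_degree(byte_addresses):
    """Largest number of distinct words that map to one bank (1 = no conflict)."""
    words_per_bank = {}
    for addr in byte_addresses:
        word = addr // BANK_WIDTH
        words_per_bank.setdefault(word % NUM_BANKS, set()).add(word)
    return max(len(words) for words in words_per_bank.values())


def divergent_paths(predicate):
    """Number of branch sides the warp executes (1 = uniform, 2 = divergent)."""
    return len({bool(predicate(tid)) for tid in range(WARP_SIZE)})


# Thread tid reads a[tid] versus a[32 * tid], with 4-byte elements.
coalesced = [4 * tid for tid in range(WARP_SIZE)]
strided = [4 * 32 * tid for tid in range(WARP_SIZE)]

print(global_transactions(coalesced))             # 1
print(global_transactions(strided))               # 32
print(bank_conflict_degree(coalesced))            # 1
print(bank_conflict_degree(strided))              # 32
print(divergent_paths(lambda tid: tid % 2 == 0))  # 2
print(divergent_paths(lambda tid: tid < 32))      # 1
```

Under these assumptions, the unit-stride access costs one transaction and no bank conflict, while the stride-32 access costs 32 transactions in global memory or a 32-way conflict in shared memory. A branch on the parity of the thread index forces the warp to execute both sides.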

Award ID(s): 2007784; 1845514

PAR ID: 10552306

Author(s) / Creator(s): Muller, Stefan K; Hoffmann, Jan

Publisher / Repository: ACM

Date Published: 2024-03-31

Journal Name: ACM Transactions on Parallel Computing

Volume: 11

Issue: 1

ISSN: 2329-4949

Page Range / eLocation ID: 1 to 53

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1145/3639403
