This content will become publicly available on March 1, 2026

Title: Modeling Utilization to Identify Shared-Memory Atomic Bottlenecks
Performance analysis is critical for GPU programs with data-dependent behavior, but models such as Roofline offer little insight for them, and interpreting raw performance counters is tedious. In this work, we present an analytical model for shared-memory atomics (fetch-and-op and compare-and-swap instructions on NVIDIA Volta and Ampere GPUs) that allows users to immediately determine whether shared-memory atomic operations are a bottleneck for a program's execution. Our model treats the architecture as a single-server queuing system whose inputs are performance counters, and it captures load-dependent behavior such as pipelining, parallelism, and different access patterns. We embody this model in a tool that uses CUDA hardware counters as parameters to predict the utilization of the shared-memory atomic unit. To the best of our knowledge, no existing profiling tool or model provides this capability for shared-memory atomic operations. We used the model to compare two histogram kernels that use shared-memory atomics. Although the kernels are nearly identical, their performance can differ by up to 30%. Our tool correctly identifies a shift of the bottleneck away from the shared-memory atomic unit as the cause of this discrepancy.
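As a rough illustration of the single-server queuing view described above, the sketch below estimates utilization from counter-like inputs using the utilization law (utilization = arrival rate × mean service time). The counter names, the assumed per-operation service time, and the example numbers are illustrative assumptions, not the paper's actual counters or parameters.

```cpp
#include <cstdio>

// Hypothetical counter inputs (e.g., the number of shared-memory atomics issued
// and the kernel duration in SM cycles, as one might gather from a profiler),
// plus an assumed effective service time of the atomic unit in cycles per op.
struct AtomicCounters {
    double shared_atomic_ops;
    double kernel_cycles;
    double cycles_per_op;
};

// Utilization law for a single-server queue: utilization = arrival rate * service time.
// Values approaching 1.0 suggest the shared-memory atomic unit is the bottleneck.
double atomic_unit_utilization(const AtomicCounters& c) {
    double arrival_rate = c.shared_atomic_ops / c.kernel_cycles;  // ops per cycle
    return arrival_rate * c.cycles_per_op;
}

int main() {
    AtomicCounters c{1.0e7, 2.5e7, 2.0};  // made-up example numbers
    std::printf("estimated utilization: %.2f\n", atomic_unit_utilization(c));
    return 0;
}
```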
Award ID(s):
2144384
PAR ID:
10589843
Author(s) / Creator(s):
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400714917
Page Range / eLocation ID:
14 to 20
Subject(s) / Keyword(s):
GPU Performance Modeling; Shared Memory Atomic Instructions; Queuing Theory
Format(s):
Medium: X
Location:
Las Vegas NV USA
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Writing correct concurrent code that uses atomics under the C/C++ memory model is extremely difficult. We present C11Tester, a race detector for the C/C++ memory model that can explore executions in a larger fragment of the C/C++ memory model than previous race-detector tools. Relative to previous work, C11Tester's larger fragment includes behaviors that are exhibited by ARM processors. C11Tester uses a new constraint-based algorithm to implement modification order that is optimized to allow C11Tester to make decisions in terms of application-visible behaviors. We evaluate C11Tester on several benchmark applications and compare C11Tester's performance to both tsan11rec, the state-of-the-art tool that controls scheduling for C/C++, and tsan11, the state-of-the-art tool that does not control scheduling.
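To make concrete why reasoning about C/C++ atomics is hard, the minimal sketch below (our illustration, not taken from C11Tester) shows the classic store-buffering pattern: with memory_order_relaxed, the outcome r1 == 0 and r2 == 0 is permitted by the C/C++ memory model, and weakly ordered hardware such as ARM can exhibit it in practice.

```cpp
#include <atomic>
#include <thread>
#include <cstdio>

std::atomic<int> x{0}, y{0};
int r1 = 0, r2 = 0;

// Each thread stores to one atomic and then loads the other with relaxed ordering.
void t1() { x.store(1, std::memory_order_relaxed); r1 = y.load(std::memory_order_relaxed); }
void t2() { y.store(1, std::memory_order_relaxed); r2 = x.load(std::memory_order_relaxed); }

int main() {
    std::thread a(t1), b(t2);
    a.join();
    b.join();
    // The memory model allows r1 == 0 and r2 == 0 simultaneously.
    std::printf("r1=%d r2=%d\n", r1, r2);
    return 0;
}
```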
  2. The abstraction of a shared memory space over separate CPU and GPU memory domains has eased the burden of portability for many HPC codebases. However, users pay for the ease of use provided by system-managed memory with a moderate-to-high performance overhead. NVIDIA Unified Virtual Memory (UVM) is currently the primary real-world implementation of such abstraction and offers a functionally equivalent testbed for in-depth performance study for both UVM and future Linux Heterogeneous Memory Management (HMM) compatible systems. The continued advocacy for UVM and HMM motivates improvement of the underlying system. We focus on UVM-based systems and investigate the root causes of UVM overhead, a non-trivial task due to complex interactions of multiple hardware and software constituents and the desired cost granularity. In our prior work, we delved deeply into UVM system architecture and showed the internal behaviors of page fault servicing in batches. We provided a quantitative evaluation of batch handling for various applications under different scenarios, including prefetching and oversubscription. We revealed that the driver workload depends on the interactions among application access patterns, GPU hardware constraints, and host OS components. Host OS components introduce significant overhead across implementations, warranting close attention. This extension furthers our prior study in three aspects: fine-grain cost analysis and breakdown, extension to multiple GPUs, and investigation of platforms with different GPU-GPU interconnects. We take a top-down approach to quantitative batch analysis and uncover how constituent component costs accumulate and overlap, governed by synchronous and asynchronous operations. Our multi-GPU analysis shows reduced cost of GPU-GPU batch workloads compared to CPU-GPU workloads. We further demonstrate that while specialized interconnects such as NVLink can improve batch cost, their benefits are limited by host OS software overhead and GPU oversubscription. This study serves as a proxy for future shared memory systems, such as those that interface with HMM, and the development of interconnects.
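For readers unfamiliar with the abstraction under study, a minimal sketch of the UVM usage pattern follows: cudaMallocManaged returns a single pointer valid on both CPU and GPU, pages migrate on demand (serviced by the driver in fault batches), and cudaMemPrefetchAsync is the explicit prefetching knob mentioned above. The kernel and sizes are illustrative only.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

// Trivial kernel that touches every element, triggering page migration to the GPU
// (on-demand fault batches) unless the pages were prefetched beforehand.
__global__ void scale(float* data, size_t n, float a) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= a;
}

int main() {
    const size_t n = 1 << 24;
    float* data = nullptr;
    cudaMallocManaged(&data, n * sizeof(float));    // one pointer, valid on CPU and GPU
    for (size_t i = 0; i < n; ++i) data[i] = 1.0f;  // pages first touched on the host

    int dev = 0;
    cudaGetDevice(&dev);
    // Optional prefetch: migrates pages up front instead of faulting them in batches.
    cudaMemPrefetchAsync(data, n * sizeof(float), dev);

    scale<<<(unsigned)((n + 255) / 256), 256>>>(data, n, 2.0f);
    cudaDeviceSynchronize();
    std::printf("data[0] = %f\n", data[0]);         // pages migrate back on CPU access
    cudaFree(data);
    return 0;
}
```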
  3.
    Deterministic execution for GPUs is a desirable property as it helps with debuggability and reproducibility. It is also important for safety regulations, as safety-critical workloads are starting to be deployed onto GPUs. Prior deterministic architectures, such as GPUDet, attempt to provide strong determinism for all types of workloads, incurring significant performance overheads due to the many restrictions that are required to satisfy determinism. We observe that a class of reduction workloads, such as graph applications and neural architecture search for machine learning, do not require such severe restrictions to preserve determinism. This motivates the design of our system, Deterministic Atomic Buffering (DAB), which provides deterministic execution with low area and performance overheads by focusing solely on ordering atomic instructions instead of all memory instructions. By scheduling atomic instructions deterministically with atomic buffering, the results of atomic operations are isolated initially and made visible in the future in a deterministic order. This allows the GPU to execute deterministically in parallel without having to serialize its threads for atomic operations, unlike GPUDet. Our simulation results show that, for atomic-intensive applications, DAB performs 4× better than GPUDet and incurs only a 23% slowdown on average compared to a non-deterministic GPU architecture. We also characterize the bottlenecks and provide insights for future optimizations.
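Below is a minimal sketch (our illustration, not DAB's design) of the reduction pattern DAB targets: each thread folds its element into a single accumulator with a floating-point atomicAdd. Because floating-point addition is not associative, the order in which the hardware commits these atomics can change the rounded result from run to run; DAB restores determinism by buffering and ordering the atomics, which this plain kernel does not do.

```cpp
// Atomic-intensive reduction: the result depends on the (non-deterministic)
// order in which the hardware serializes the floating-point atomicAdds.
__global__ void atomic_sum(const float* in, size_t n, float* out) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(out, in[i]);
}
```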
  4. Recent algorithmic developments have achieved competitive classification accuracy for neural networks despite constraining the network parameters to ternary or binary representations. These findings show significant optimization opportunities to replace computationally intensive convolution operations (based on multiplication) with more efficient and less complex operations such as addition. In the hardware implementation domain, processing-in-memory architectures are becoming a promising solution to alleviate the enormous, energy-hungry data communication between memory and processing units, bringing considerable improvements in system performance and energy efficiency while running such large networks. In this paper, we review several of our recent works on Processing-in-Memory (PIM) accelerators based on Magnetic Random Access Memory computational sub-arrays that accelerate the inference mode of quantized neural networks using digital non-volatile memory rather than analog crossbar operation. In this way, we investigate the performance of two distinct in-memory addition schemes compared to other digital methods based on processing-in-DRAM/GPU/ASIC designs to tackle the DNN power and memory-wall bottlenecks.
  5. We present a single-node, multi-GPU programmable graph processing library that allows programmers to easily extend single-GPU graph algorithms to achieve scalable performance on large graphs with billions of edges. Directly using the single-GPU implementations, our design only requires programmers to specify a few algorithm-dependent concerns, hiding most multi-GPU related implementation details. We analyze the theoretical and practical limits to scalability in the context of varying graph primitives and datasets. We describe several optimizations, such as direction-optimizing traversal and a just-enough memory allocation scheme, for better performance and smaller memory consumption. Compared to previous work, we achieve best-of-class performance across operations and datasets, including excellent strong and weak scalability on most primitives as we increase the number of GPUs in the system.
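As a rough sketch of the programming pattern such a library automates (the function names and the even vertex partitioning below are illustrative assumptions, not the library's API), each GPU in the node is assigned a contiguous slice of the vertex range and runs the single-GPU kernel on its slice.

```cpp
#include <algorithm>
#include <cuda_runtime.h>

// Placeholder for the single-GPU per-vertex kernel; real graph views, frontiers,
// and inter-GPU communication are omitted from this sketch.
__global__ void process_vertices(int first, int last) {
    int v = first + blockIdx.x * blockDim.x + threadIdx.x;
    if (v < last) {
        // the single-GPU algorithm's per-vertex work would go here
    }
}

// Launch the same single-GPU kernel on every device, each owning an even slice
// of the vertex range (real libraries may partition by edges or by frontier).
void run_on_all_gpus(int num_vertices) {
    int num_gpus = 0;
    cudaGetDeviceCount(&num_gpus);
    int per_gpu = (num_vertices + num_gpus - 1) / num_gpus;
    for (int g = 0; g < num_gpus; ++g) {
        cudaSetDevice(g);
        int first = g * per_gpu;
        int last = std::min(num_vertices, first + per_gpu);
        if (last > first)
            process_vertices<<<(last - first + 255) / 256, 256>>>(first, last);
    }
    for (int g = 0; g < num_gpus; ++g) {  // wait for all devices to finish
        cudaSetDevice(g);
        cudaDeviceSynchronize();
    }
}
```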