GPU-aware support to improve the performance of collective communication on GPUs

Cui, Y; Palla, A; Zhang, T; Wang, S; Roten, D; Koesterke, L; Zhang, Z; Maechling, P

Citation Details

This content will become publicly available on September 10, 2026

GPU-aware support to improve the performance of collective communication on GPUs

We have implemented GPU-aware support across all AWP-ODC versions and enhanced message-passing collective communications for this memory-bound finite-difference solver. This provides cutting-edge communication support for production simulations on leadership-class computing facilities, including OLCF Frontier and TACC Vista. We achieved significant performance gains, reaching 37 sustained Petaflop/s and reducing time-to-solution by 17.2% using the GPU-aware feature on 8,192 Frontier nodes, or 65,336 MI250X GCDs. The AWP-ODC code has also been optimized for TACC Vista, an Arm-based NVIDIA GH200 Grace Hopper Superchip, demonstrating excellent application performance. This poster will showcase studies and GPU performance characteristics. We will discuss our verification of GPU-aware development and the use of high-performance MVAPICH libraries, including on-the-fly compression, on modern GPU clusters. more »

Award ID(s):: 2311833

PAR ID:: 10630255

Author(s) / Creator(s):: Cui, Y; Palla, A; Zhang, T; Wang, S; Roten, D; Koesterke, L; Zhang, Z; Maechling, P

Publisher / Repository:: SCEC Publications

Date Published:: 2025-09-10

Format(s):: Medium: X

Location:: Palm Springs

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on September 10, 2026
Conference Proceeding:
The DOI is not currently available.

More Like this