Accelerating AWP-ODC for large-scale earthquake simulations using MVAPICH2

Talreja, A; Palla, A; Roten, D; Zhou, Q; Xu, L; Cui, Y

AWP-ODC is a 4th-order finite difference code used for linear wave propagation, Iwan-type nonlinear dynamic rupture and wave propagation, and Strain Green Tensor simulations. We have ported the linear and topography versions of AWP-ODC, including discontinuous mesh support, to HIP and verified them, so that the code can also run on AMD GPUs. The topography code achieved 99.6% parallel efficiency on 4,096 nodes of Frontier, a leadership-class supercomputer at Oak Ridge National Laboratory. We have also implemented CUDA-aware features and on-the-fly GDR compression in the linear version of the ported HIP code. These enhancements improve data transfer efficiency between GPUs, reducing communication overhead and boosting overall performance. We have extended the CUDA-aware features to the topography version and are actively working on incorporating GDR compression into it as well. For linear AWP-ODC on Lonestar-6 A100 nodes, MVAPICH2-GDR with CUDA-aware support and on-the-fly compression delivers a 154% benefit over Intel MPI (IMPI). Furthermore, we have integrated a checkpointing feature into the nonlinear Iwan version of AWP-ODC, in preparation for future extreme-scale simulations during Texascale Days on Frontera at TACC.

Award ID(s): 2311833

PAR ID: 10538017

Author(s) / Creator(s): Talreja, A; Palla, A; Roten, D; Zhou, Q; Xu, L; Cui, Y

Publisher / Repository: Annual MVAPICH User Group (MUG) Conference

Date Published: 2024-08-19

Format(s): Medium: X

Location: Columbus

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Proceeding: The DOI is not currently available.