This content will become publicly available on August 20, 2026

Title: Accelerating AWP-ODC for Large-scale Earthquake Simulations Using MVAPICH2
Accurate simulation of earthquake scenarios is essential for advancing seismic hazard analysis and risk mitigation strategies. At the San Diego Supercomputer Center (SDSC), our research focuses on optimizing the performance and reliability of large-scale earthquake simulations using the AWP-ODC software. By implementing GPU-aware MPI calls, we let MPI operate directly on data in GPU memory, eliminating the need for explicit data transfers between CPU and GPU. This GPU-aware MPI achieves nearly ideal parallel efficiency at full scale on both NVIDIA and AMD GPUs, leveraging MVAPICH-Plus support on Frontier at Oak Ridge National Laboratory and Vista at the Texas Advanced Computing Center (TACC). We used MVAPICH-Plus 4.0 to enable ZFP compression, which significantly improves inter-node communication efficiency, a critical gain given the communication bottleneck inherent in large-scale simulations. Our GPU-aware AWP-ODC versions include linear forward, topography, and nonlinear Iwan-type solvers with discontinuous mesh support. On the Frontier system with MVAPICH 4.0, HIP-aware MPI calls on MI250X GPUs deliver nearly ideal weak-scaling speedup up to 8,192 nodes for both the linear and topography versions. On TACC's Vista system, CUDA-aware MPI calls on GH200 GPUs substantially outperform their non-GPU-aware counterparts across all three solver versions. This poster will present a detailed evaluation of GPU-aware AWP-ODC using MVAPICH, including the impact of ZFP message compression compared to the native versions. Our results highlight the importance of MVAPICH support for GPU-aware MPI and on-the-fly compression techniques for accelerating and scaling earthquake simulations.
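The scaling terms used in the abstract can be made concrete with a small sketch. In weak scaling the per-GPU problem size stays fixed as nodes are added, so ideal behavior is a constant run time, and parallel efficiency is the baseline time divided by the time at scale. The function names and all timings below are hypothetical placeholders, not measurements from this work. The figure of 8 GCDs per Frontier node is an assumption, chosen because it matches the node and GCD counts quoted in the related records (4,096 nodes and 32,768 GCDs; 8,192 nodes and 65,536 GCDs).

```python
GCDS_PER_FRONTIER_NODE = 8  # assumption: 4 MI250X GPUs per node, 2 GCDs each


def frontier_gcds(nodes: int) -> int:
    """Number of MI250X GCDs on a given number of Frontier nodes."""
    return nodes * GCDS_PER_FRONTIER_NODE


def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Weak-scaling parallel efficiency: baseline time over time at scale."""
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("timings must be positive")
    return t_base / t_scaled


def time_to_solution_reduction(t_reference: float, t_improved: float) -> float:
    """Fractional reduction in time-to-solution relative to a reference run."""
    if t_reference <= 0 or t_improved <= 0:
        raise ValueError("timings must be positive")
    return (t_reference - t_improved) / t_reference


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, for illustration only.
    t_small_run = 100.0      # baseline run on a few nodes
    t_full_scale = 105.0     # same per-GPU workload on many nodes
    t_non_gpu_aware = 100.0  # host-staged MPI communication
    t_gpu_aware = 83.0       # GPU-aware MPI communication

    print(frontier_gcds(8192))                                              # 65536
    print(f"{weak_scaling_efficiency(t_small_run, t_full_scale):.1%}")      # 95.2%
    print(f"{time_to_solution_reduction(t_non_gpu_aware, t_gpu_aware):.1%}")  # 17.0%
```

With this definition, an efficiency above 100% simply means the run at scale finished slightly faster than the baseline run.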
Award ID(s):
2311833
PAR ID:
10630252
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Annual MVAPICH User Group (MUG) Conference
Date Published:
Format(s):
Medium: X
Location:
Columbus
Sponsoring Org:
National Science Foundation
More Like this
  1. We have implemented GPU-aware support across all AWP-ODC versions and enhanced message-passing collective communications for this memory-bound finite-difference solver. This provides cutting-edge communication support for production simulations on leadership-class computing facilities, including OLCF Frontier and TACC Vista. We achieved significant performance gains, reaching 37 sustained Petaflop/s and reducing time-to-solution by 17.2% using the GPU-aware feature on 8,192 Frontier nodes, or 65,536 MI250X GCDs. The AWP-ODC code has also been optimized for TACC Vista, an Arm-based NVIDIA GH200 Grace Hopper Superchip system, demonstrating excellent application performance. This poster will showcase scaling studies and GPU performance characteristics. We will discuss our verification of GPU-aware development and the use of high-performance MVAPICH libraries, including on-the-fly compression, on modern GPU clusters.
  2. We have ported the topography version of AWP-ODC, with the discontinuous mesh feature enabled, to HIP and verified it on AMD MI250X GPUs. We benchmarked 103.3% parallel efficiency on Frontier between 8 and 4,096 nodes, or up to 32,768 GCDs. Frontier, a Leadership Computing Facility system at Oak Ridge National Laboratory (ORNL), is a two-exaflop/s computing system based on AMD Radeon Instinct GPUs and EPYC CPUs. This HIP topography code has been used in production runs on Frontier, a primary computing engine for the 2024 SCEC INCITE allocation, an award of 700K node-hours of supercomputing time. Furthermore, we implemented ROCm-aware GPU-direct support in the topography code and demonstrated an additional 14% reduction in time-to-solution on up to 4,096 nodes. The AWP-ODC-Topo code is also tuned for TACC Vista, an Arm-based NVIDIA GH200 Grace Hopper Superchip system, with excellent performance demonstrated. This poster will present weak-scaling studies and performance characteristics on GPUs. We discuss our efforts to verify the ROCm-aware development and to use high-performance MVAPICH libraries with on-the-fly compression on modern GPU clusters.
  3. We integrate GPU-aware MVAPICH2 in AWP-ODC, a scalable finite difference code for wave propagation in nonlinear media. On OLCF Frontier, HIP-aware MVAPICH2 yields a 17.8% time-to-solution (T2S) improvement over the non-GPU-aware version and achieves 95% parallel efficiency on 65,536 AMD MI250X GCDs. On TACC Vista, CUDA-aware MVAPICH2 delivers a 3.5% performance gain across 2 to 256 NVIDIA GH200 GPUs, with parallel efficiencies of 82% in the linear case and 92% in the computationally more intense nonlinear case. We deploy the code for production-scale earthquake simulations on leadership-class systems.
  4. AWP-ODC is a 4th-order finite difference code used for linear wave propagation, Iwan-type nonlinear dynamic rupture and wave propagation, and Strain Green Tensor simulation. We have ported and verified the linear and topography versions of AWP-ODC, with discontinuous mesh as well as topography support, to HIP so that they can also run on AMD GPUs. The topography code achieved 99.6% parallel efficiency on 4,096 nodes on Frontier, a Leadership Computing Facility system at Oak Ridge National Laboratory. We have also implemented CUDA-aware features and on-the-fly GDR compression in the linear version of the ported HIP code. These enhancements significantly improve data transfer efficiency between GPUs, reducing communication overhead and boosting overall performance. We have also extended CUDA-aware features to the topography version and are actively working on incorporating GDR compression into this version as well. We see a 154% benefit over IMPI with MVAPICH2-GDR using CUDA-aware support and on-the-fly compression for linear AWP-ODC on Lonestar-6 A100 nodes. Furthermore, we have successfully integrated a checkpointing feature into the nonlinear Iwan version of AWP-ODC, in preparation for future extreme-scale simulations during Texascale Days on Frontera at TACC.
  5. The Gordon Bell winning AWP-ODC application has a long history of performance gains with MVAPICH on both CPU- and GPU-based architectures. This talk will highlight compression support recently implemented by the MVAPICH team and its benefits for large-scale earthquake simulation on leadership-class computing systems. The presentation will conclude with a discussion of the opportunities and technical challenges associated with developing earthquake simulation software for Exascale computing.