Title: GRaM-X: a new GPU-accelerated dynamical spacetime GRMHD code for Exascale computing with the Einstein Toolkit
Abstract: We present GRaM-X (General Relativistic accelerated Magnetohydrodynamics on AMReX), a new GPU-accelerated dynamical-spacetime general relativistic magnetohydrodynamics (GRMHD) code which extends the GRMHD capability of the Einstein Toolkit to GPU-based exascale systems. GRaM-X supports 3D adaptive mesh refinement (AMR) on GPUs via a new AMR driver for the Einstein Toolkit called CarpetX, which in turn leverages AMReX, an AMR library developed for use by the United States DOE's Exascale Computing Project. We use the Z4c formalism to evolve the Einstein equations and the Valencia formulation to evolve the equations of GRMHD. GRaM-X supports both analytic and tabulated equations of state. We implement TVD and WENO reconstruction methods as well as the HLLE Riemann solver. We test the accuracy of the code using a range of tests on static spacetimes, e.g. 1D magnetohydrodynamic shock tubes, the 2D magnetic rotor, and a cylindrical explosion, as well as on dynamical spacetimes, i.e. the oscillations of a 3D Tolman-Oppenheimer-Volkoff star. We find excellent agreement with analytic results and with the results of other codes reported in the literature. We also perform scaling tests and find that GRaM-X shows a weak scaling efficiency of ∼40%–50% on 2304 nodes (13824 NVIDIA V100 GPUs) with respect to single-node performance on OLCF's supercomputer Summit.
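The abstract names the HLLE Riemann solver and quotes a weak-scaling efficiency; as a quick reference, the standard textbook forms of both (these are generic definitions, not expressions taken from the paper) are:

```latex
% Standard HLLE flux between left/right states U_L, U_R with fluxes F_L, F_R,
% where \lambda^- = \min(0, \lambda_L) and \lambda^+ = \max(0, \lambda_R) are the
% outermost signal-speed estimates:
\[
  F^{\mathrm{HLLE}}
    = \frac{\lambda^{+} F_L - \lambda^{-} F_R
            + \lambda^{+}\lambda^{-}\,(U_R - U_L)}
           {\lambda^{+} - \lambda^{-}}
\]
% Weak-scaling efficiency as usually defined (fixed work per node,
% t_1 = single-node time, t_N = time on N nodes):
\[
  E_{\mathrm{weak}}(N) = \frac{t_1}{t_N}
\]
```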
Award ID(s): 2004879
PAR ID: 10463814
Publisher / Repository: IOP Publishing
Journal Name: Classical and Quantum Gravity
Volume: 40
Issue: 20
ISSN: 0264-9381
Page Range / eLocation ID: Article No. 205009
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract: We present AsterX, a novel open-source, modular, GPU-accelerated, fully general relativistic magnetohydrodynamic (GRMHD) code designed for dynamical spacetimes in 3D Cartesian coordinates and tailored for exascale computing. We utilize block-structured adaptive mesh refinement (AMR) through CarpetX, the new driver for the Einstein Toolkit, which is built on AMReX, a software framework for massively parallel applications. AsterX employs the Valencia formulation for GRMHD, coupled with the Z4c formalism for spacetime evolution, while incorporating high-resolution shock-capturing schemes to accurately handle the hydrodynamics. AsterX has undergone rigorous testing in both static and dynamical spacetimes, demonstrating remarkable accuracy and agreement with other codes in the literature. Using subcycling in time, we find an overall performance gain of a factor of 2.5–4.5. Benchmarking the code through scaling tests on OLCF's Frontier supercomputer, we demonstrate a weak scaling efficiency of about 67%–77% on 4096 nodes compared to 8-node performance.
  2. Abstract: We present the implementation of a two-moment-based general-relativistic multigroup radiation transport module in the General-relativistic multigrid numerical (Gmunu) code. On top of solving the general-relativistic magnetohydrodynamics and the Einstein equations with conformally flat approximations, the code solves the evolution equations of the zeroth- and first-order moments of the radiation in the Eulerian frame. An analytic closure relation is used to obtain the higher-order moments and close the system. A finite-volume discretization has been adopted for the radiation moments. The advection in space and in frequency space is handled explicitly. In addition, the radiation–matter interaction terms, which are very stiff in the optically thick region, are solved implicitly. Implicit–explicit Runge–Kutta schemes are adopted for time integration. We test the implementation with a number of numerical benchmarks, from frequency-integrated to frequency-dependent cases. Furthermore, we also illustrate astrophysical applications in hot neutron star and core-collapse supernova modeling, and compare with other neutrino transport codes.
  3. Abstract: General relativistic magnetohydrodynamic (GRMHD) simulations have revolutionized our understanding of black hole accretion. Here, we present a GPU-accelerated GRMHD code, H-AMR, with multifaceted optimizations that, collectively, accelerate computation by 2–5 orders of magnitude for a wide range of applications. First, it introduces a spherical grid with 3D adaptive mesh refinement that operates in each of the three dimensions independently. This allows us to circumvent the Courant condition near the polar singularity, which otherwise cripples high-resolution computational performance. Second, we demonstrate that local adaptive time stepping on a logarithmic spherical-polar grid accelerates computation by a factor of ≲10 compared to traditional hierarchical time-stepping approaches. Jointly, these unique features lead to an effective speed of ∼10⁹ zone cycles per second per node on 5400 NVIDIA V100 GPUs (i.e., 900 nodes of the OLCF Summit supercomputer). We illustrate H-AMR's computational performance by presenting the first GRMHD simulation of a tilted thin accretion disk threaded by a toroidal magnetic field around a rapidly spinning black hole. With an effective resolution of 13,440 × 4,608 × 8,092 cells, a total of ≲22 billion cells, and ∼0.65 × 10⁸ time steps, it is among the largest astrophysical simulations ever performed. We find that frame dragging by the black hole tears up the disk into two independently precessing subdisks. The innermost subdisk rotation axis intermittently aligns with the black hole spin, demonstrating for the first time that such long-sought alignment is possible in the absence of large-scale poloidal magnetic fields.
  4. Chi-Wang Shu (Ed.)
    GPU computing is expected to play an integral part in all modern Exascale supercomputers. It is also expected that higher order Godunov schemes will make up a significant fraction of the application mix on such supercomputers. It is, therefore, very important to prepare the community of users of higher order schemes for hyperbolic PDEs for this emerging opportunity. Not every algorithm that is used in the space-time update of the solution of hyperbolic PDEs will take well to GPUs. However, we identify a small core of algorithms that take exceptionally well to GPU computing. Based on an analysis of available options, we have been able to identify weighted essentially non-oscillatory (WENO) algorithms for spatial reconstruction along with arbitrary derivative (ADER) algorithms for time extension, followed by a corrector step, as the winning three-part algorithmic combination. Even when a winning subset of algorithms has been identified, it is not clear that they will port seamlessly to GPUs. The low data throughput between CPU and GPU, as well as the very small cache sizes on modern GPUs, implies that we have to think through all aspects of the task of porting an application to GPUs. For that reason, this paper identifies the techniques and tricks needed for making a successful port of this very useful class of higher order algorithms to GPUs. Application codes face a further challenge: the GPU results need to be practically indistinguishable from the CPU results in order for the legacy knowledge bases embedded in these application codes to be preserved during the port to GPUs. This requirement often makes a complete code rewrite impossible. For that reason, it is safest to use an approach based on OpenACC directives, so that most of the code remains intact (as long as it was originally well written). This paper is intended to be a one-stop shop for anyone seeking to make an OpenACC-based port of a higher order Godunov scheme to GPUs. We focus on three broad and high-impact areas where higher order Godunov schemes are used. The first area is computational fluid dynamics (CFD). The second is computational magnetohydrodynamics (MHD), which has an involution constraint that has to be mimetically preserved. The third is computational electrodynamics (CED), which has involution constraints and also extremely stiff source terms. Together, these three diverse uses of higher order Godunov methodology cover many of the most important application areas. In all three cases, we show that the optimal use of algorithms, techniques, and tricks, along with the use of OpenACC, yields superlative speedups on GPUs. As a bonus, we find a most remarkable and desirable result: some higher order schemes, with their larger operation counts per zone, show better speedup than lower order schemes on GPUs. In other words, the GPU is an optimal stratagem for overcoming the higher computational complexity of higher order schemes. Several avenues for future improvement have also been identified. A scalability study is presented for a real-world application using GPUs and comparable numbers of high-end multicore CPUs. It is found that GPUs offer a substantial performance benefit over a comparable number of CPUs, especially when all the methods designed in this paper are used.
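A few generic sketches of the techniques named in the related records above may help orient readers; none of them are taken from the papers themselves. First, the "subcycling in time" speed-up quoted for AsterX (item 1) refers to the usual subcycled AMR time stepping, in which, for a refinement factor of 2, each finer level advances with half the time step of its parent:

```latex
% Generic subcycled-AMR step sizes with refinement factor 2 (illustrative only):
\[
  \Delta t_{\ell} = \frac{\Delta t_0}{2^{\ell}},
  \qquad \ell = 0, 1, \dots, \ell_{\max},
\]
% so coarse levels take far fewer steps than they would if every level were forced
% to advance with the globally smallest step \Delta t_{\ell_{\max}}.
```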
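Second, the "analytic closure relation" mentioned for Gmunu (item 2) is not specified in the abstract; a widely used example of such a closure is the Levermore (M1) form, written here in its schematic flat-space, orthonormal-frame version purely as an illustration:

```latex
% Illustrative M1 (Levermore-type) analytic closure: the radiation pressure tensor is
% expressed through the energy density E and flux F^i, with flux factor f = |F|/E:
\[
  P^{ij} = E \left[ \frac{1-\chi(f)}{2}\,\delta^{ij}
                  + \frac{3\chi(f)-1}{2}\, n^i n^j \right],
  \qquad n^i = \frac{F^i}{|F|},
\]
\[
  \chi(f) = \frac{3 + 4 f^{2}}{5 + 2\sqrt{4 - 3 f^{2}}},
\]
% which recovers \chi = 1/3 (isotropic, optically thick) for f = 0 and
% \chi = 1 (free streaming) for f = 1.
```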
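Third, the Courant-condition bottleneck near the polar singularity discussed for H-AMR (item 3) arises because the azimuthal cell width on a spherical-polar grid shrinks as sin θ; schematically (this is the generic CFL bound, not the paper's exact criterion):

```latex
% Schematic CFL restriction on a spherical-polar grid: the physical cell extents are
% \Delta r, r\,\Delta\theta and r\sin\theta\,\Delta\phi, so as \theta -> 0 the last
% term forces a tiny global time step unless cells are coarsened near the pole or
% the time step is adapted locally:
\[
  \Delta t \;\le\; C \,
    \frac{\min\!\left(\Delta r,\; r\,\Delta\theta,\; r\sin\theta\,\Delta\phi\right)}
         {\lambda_{\max}},
  \qquad 0 < C \le 1 .
\]
```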
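Finally, the OpenACC-based porting strategy of item 4 amounts to keeping the loop structure of a legacy higher order Godunov code intact and annotating it with directives. The fragment below is a minimal, self-contained sketch of that style on a toy first-order advection update; it is illustrative only and is not code from the paper:

```c
/* Minimal sketch of the OpenACC porting style described above: a toy first-order
 * finite-volume update whose loops are offloaded with directives while the code
 * structure stays CPU-readable. Not taken from any of the papers. */
#include <stdio.h>
#define N 1024

int main(void) {
    static double u[N], unew[N], flux[N + 1];
    const double a = 1.0, dt_dx = 0.4;           /* advection speed and CFL ratio */

    for (int i = 0; i < N; i++)                   /* shock-tube-like initial data  */
        u[i] = (i < N / 2) ? 1.0 : 0.1;

    /* Keep the arrays resident on the device for both loops, copy results back. */
    #pragma acc data copyin(u) copy(unew) create(flux)
    {
        #pragma acc parallel loop                  /* upwind numerical flux         */
        for (int i = 1; i < N; i++)
            flux[i] = a * u[i - 1];

        #pragma acc parallel loop                  /* conservative cell update      */
        for (int i = 1; i < N - 1; i++)
            unew[i] = u[i] - dt_dx * (flux[i + 1] - flux[i]);
    }

    printf("u[N/2] after one step: %g\n", unew[N / 2]);
    return 0;
}
```

Compiled with an OpenACC-aware compiler the two annotated loops run on the GPU; with an ordinary C compiler the pragmas are ignored and the same code runs serially, which is exactly the "most of the code remains intact" property the abstract emphasizes.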