skip to main content

Search for: All records

Creators/Authors contains: "Sundar, Hari"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Numerically solving partial differential equations (PDEs) remains a compelling application of supercomputing resources. The next generation of computing resources - exhibiting increased parallelism and deep memory hierarchies - provide an opportunity to rethink how to solve PDEs, especially time dependent PDEs. Here, we consider time as an additional dimension and simultaneously solve for the unknown in large blocks of time (i.e. in 4D space-time), instead of the standard approach of sequential time-stepping. We discretize the 4D space-time domain using a mesh-free kD tree construction that enables good parallel performance as well as on-the-fly construction of adaptive 4D meshes. To best use the 4D space-time mesh adaptivity, we invoke concepts from PDE analysis to establish rigorous a posteriori error estimates for a general class of PDEs. We solve canonical linear as well as non-linear PDEs (heat diffusion, advection-diffusion, and Allen-Cahn) in space-time, and illustrate the following advantages: (a) sustained scaling behavior across a larger processor count compared to sequential time-stepping approaches, (b) the ability to capture "localized" behavior in space and time using the adaptive space-time mesh, and (c) removal of any time-stepping constraints like the Courant-Friedrichs-Lewy (CFL) condition, as well as the ability to utilize spatially varying time-steps. We believemore »that the algorithmic and mathematical developments along with efficient deployment on modern architectures shown in this work constitute an important step towards improving the scalability of PDE solvers on the next generation of supercomputers.« less
  2. Multigrid is one of the most effective methods for solving elliptic PDEs. It is algorithmically optimal and is robust when combined with Krylov methods. Algebraic multigrid is especially attractive due to its blackbox nature. This however comes at the cost of increased setup costs that can be significant in case of systems where the system matrix changes frequently making it difficult to amortize the setup cost. In this work, we investigate several strategies for performing lazy updates to the multigrid hierarchy corresponding to changes in the system matrix. These include delayed updates, value updates without changing structure, process local changes, and full updates. We demonstrate that in many cases, the overhead of building the AMG hierarchy can be mitigated for rapidly changing system matrices.
  3. We present a portable and highly-scalable framework that targets problems in the astrophysics and numerical relativity communities. This framework combines together the parallel Dendro octree with wavelet adaptive multiresolution and an automatic code-generation physics module to solve the Einstein equations of general relativity in the BSSNOK formulation. The goal of this work is to perform advanced, massively parallel numerical simulations of binary black hole and neutron star mergers, including Intermediate Mass Ratio Inspirals (IMRIs) of binary black holes with mass ratios on the order of 100:1. These studies will be used to study waveforms for use in LIGO data analysis and to calibrate approximate methods for generating gravitational waveforms. The key contribution of this work is the development of automatic code generators for computational relativity supporting SIMD vectorization, OpenMP, and CUDA combined with efficient distributed memory adaptive data-structures. These have enabled the development of efficient codes that demonstrate excellent weak scalability up to 131K cores on ORNL's Titan for binary mergers for mass ratios up to 100.