NSF PAR Search | NSF Public Access Repository

Automating GPU Scalability for Complex Scientific Models: Phonon Boltzmann Transport Equation

https://doi.org/10.1109/IPDPS57955.2024.00045

Heisler, Eric; Saurav, Siddharth; Deshmukh, Aadesh; Mazumder, Sandip; Sundar, Hari (May 2024, IEEE)

Heterogeneous computing environments combining CPU and GPU resources provide a great boost to large-scale scientific computing applications. Code generation utilities that partition the work into CPU and GPU tasks while considering data movement costs allow researchers to develop high-performance solutions more quickly and easily, and make these resources accessible to a larger user base. We present developments for a domain-specific language (DSL) and code generation framework for solving partial differential equations (PDEs). These enhancements facilitate GPU-accelerated solution of the Boltzmann transport equation (BTE) for phonons, which is the governing equation for simulating thermal transport in semiconductor materials at sub-micron scales. Solving the BTE involves thousands of coupled PDEs and complicated boundary conditions, and requires solving a nonlinear equation that couples all of the degrees of freedom at each time step. These developments enable the DSL to generate configurable hybrid GPU/CPU code that couples accelerated kernels with user-defined code. With minimal additional programming effort, we observed performance improvements of around 18X compared to a CPU-only version produced by the same DSL.
Full Text Available
Localization landscape of optical waves in multifractal photonic membranes

https://doi.org/10.1364/OME.520201

Shubitidze, Tornike; Zhu, Yilin; Sundar, Hari; Dal Negro, Luca (March 2024, Optical Materials Express)

In this paper, we investigate the localization properties of optical waves in disordered systems with multifractal scattering potentials. In particular, we apply the localization landscape theory to the classical Helmholtz operator and, without solving the associated eigenproblem, show accurate predictions of localized eigenmodes for one- and two-dimensional multifractal structures. Finally, we design and fabricate nanoperforated photonic membranes in silicon nitride (SiN) and image directly their multifractal modes using leaky-mode spectroscopy in the visible spectral range. The measured data demonstrate optical resonances with multiscale intensity fluctuations in good qualitative agreement with numerical simulations. The proposed approach provides a convenient strategy to design multifractal photonic membranes, enabling rapid exploration of extended scattering structures with tailored disorder for enhanced light-matter interactions.
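
As a rough illustration of the idea of predicting localization without solving an eigenproblem, the sketch below computes a localization landscape u by solving H u = 1 once. It is only a minimal stand-in under stated assumptions: it uses a one-dimensional Schrödinger-type operator H = -d²/dx² + V with zero boundary values rather than the Helmholtz operator treated in the paper, and the random potential, grid size, and function names are hypothetical.

```python
import numpy as np


def build_operator(potential, dx=1.0):
    """Finite-difference H = -d^2/dx^2 + V with zero (Dirichlet) boundaries.

    Assumption: a Schrodinger-type stand-in, not the paper's Helmholtz operator.
    """
    v = np.asarray(potential, dtype=float)
    n = v.size
    off = -np.ones(n - 1) / dx**2
    return np.diag(2.0 / dx**2 + v) + np.diag(off, 1) + np.diag(off, -1)


def localization_landscape(potential, dx=1.0):
    """Solve H u = 1; a single linear solve, no eigenproblem."""
    H = build_operator(potential, dx)
    return np.linalg.solve(H, np.ones(H.shape[0]))


# Hypothetical disordered, non-negative potential on 200 grid points.
rng = np.random.default_rng(0)
V = rng.uniform(0.0, 4.0, size=200)

u = localization_landscape(V)
effective_potential = 1.0 / u  # low-energy modes tend to sit in its wells
```

In landscape theory the reciprocal 1/u acts as an effective confining potential, so its wells indicate where low-energy modes are expected to localize.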
An autoencoder compression approach for accelerating large-scale inverse problems

https://doi.org/10.1088/1361-6420/acfbe1

Wittmer, Jonathan; Badger, Jacob; Sundar, Hari; Bui-Thanh, Tan (October 2023, Inverse Problems)
Partial differential equation (PDE)-constrained inverse problems are some of the most challenging and computationally demanding problems in computational science today. Fine meshes required to accurately compute the PDE solution introduce an enormous number of parameters and require large-scale computing resources such as more processors and more memory to solve such systems in a reasonable time. For inverse problems constrained by time-dependent PDEs, the adjoint method often employed to compute gradients and higher order derivatives efficiently requires solving a time-reversed, so-called adjoint PDE that depends on the forward PDE solution at each timestep. This necessitates the storage of a high-dimensional forward solution vector at every timestep. Such a procedure quickly exhausts the available memory resources. Several approaches that trade additional computation for reduced memory footprint have been proposed to mitigate the memory bottleneck, including checkpointing and compression strategies. In this work, we propose a close-to-ideal scalable compression approach using autoencoders to eliminate the need for checkpointing and substantial memory storage, thereby reducing the time-to-solution and memory requirements. We compare our approach with checkpointing and an off-the-shelf compression approach on an earth-scale ill-posed seismic inverse problem. The results verify the expected close-to-ideal speedup for the gradient and Hessian-vector product using the proposed autoencoder compression approach. To highlight the usefulness of the proposed approach, we combine the autoencoder compression with the data-informed active subspace (DIAS) prior showing how the DIAS method can be affordably extended to large-scale problems without the need for checkpointing and large memory.
Full Text Available
Scalable parallelization for the solution of phonon Boltzmann Transport Equation

https://doi.org/10.1145/3577193.3593723

Tran, Han D.; Saurav, Siddharth; Sadayappan, P.; Mazumder, Sandip; Sundar, Hari (June 2023, ACM)

The Boltzmann Transport Equation (BTE) for phonons is often used to predict thermal transport at submicron scales in semiconductors. The BTE is a seven-dimensional nonlinear integro-differential equation, which makes it difficult to solve even after linearization under the single relaxation time approximation. Furthermore, parallelization and load balancing are challenging, given the high dimensionality and variability of the linear systems' conditioning. This work presents a 'synthetic' scalable parallelization method for solving the BTE on large-scale systems. The method includes cell-based parallelization, combined band+cell-based parallelization, and a batching technique. The essential computational ingredient of cell-based parallelization is a sparse matrix-vector product (SpMV) that can be integrated with an existing linear algebra library like PETSc. The combined approach enhances the cell-based method by further parallelizing the band dimension to take advantage of low inter-band communication costs. For the batched approach, we developed a batched SpMV that enables multiple linear systems to be solved simultaneously, merging many MPI messages to reduce communication costs, thus maintaining scalability when the grain size becomes very small. We present numerical experiments to demonstrate our method's excellent speedups and scalability up to 16384 cores for a problem with 12.6 billion unknowns.
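
To make the batched SpMV idea concrete, here is a minimal single-process sketch, not the paper's implementation. It assumes that all matrices in the batch share one CSR sparsity pattern and differ only in their nonzero values, so one pass over the pattern serves every system; the function name and array layout are hypothetical, and the MPI message merging described in the abstract is not modeled.

```python
import numpy as np


def batched_spmv(indptr, indices, data, x):
    """Compute y[b] = A_b @ x[b] for a batch of CSR matrices.

    Assumption: every A_b shares the same sparsity pattern (indptr, indices);
    data has shape (batch, nnz) and x has shape (batch, ncols).
    """
    indptr = np.asarray(indptr)
    indices = np.asarray(indices)
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)

    nrows = indptr.size - 1
    # Multiply every stored entry by its matching vector entry, for all systems.
    prod = data * x[:, indices]
    y = np.zeros((data.shape[0], nrows))
    for i in range(nrows):
        # Sum the products belonging to row i (empty rows stay zero).
        y[:, i] = prod[:, indptr[i]:indptr[i + 1]].sum(axis=1)
    return y


# Shared pattern of a 2x2 upper-triangular matrix [[a, b], [0, c]].
indptr = [0, 2, 3]
indices = [0, 1, 1]
data = [[1.0, 2.0, 3.0],   # A_0 = [[1, 2], [0, 3]]
        [4.0, 5.0, 6.0]]   # A_1 = [[4, 5], [0, 6]]
x = [[1.0, 1.0],
     [1.0, 2.0]]

y = batched_spmv(indptr, indices, data, x)
```

In a distributed setting the same grouping would let the halo values for all systems in a batch travel in one message per neighbor instead of one message per system, which is the communication saving the abstract refers to.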
Full Text Available
Scalable adaptive algorithms for next-generation multiphase flow simulations

https://doi.org/10.1109/IPDPS54959.2023.00065

Saurabh, Kumar; Ishii, Masado; Khanwale, Makrand A.; Sundar, Hari; Ganapathysubramanian, Baskar (May 2023, 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS))

Full Text Available
Multi-discretization domain specific language and code generation for differential equations

https://doi.org/10.1016/j.jocs.2023.101981

Heisler, Eric; Deshmukh, Aadesh; Mazumder, Sandip; Sadayappan, Ponnuswamy; Sundar, Hari (April 2023, Journal of Computational Science)

Finch, a domain specific language and code generation framework for partial differential equations (PDEs), is demonstrated here to solve two classical problems: steady-state advection diffusion equation (single PDE) and the phonon Boltzmann transport equation (coupled PDEs). Both finite volume and finite element methods are explored. In addition to work presented at the 2022 International Conference on Computational Science (Heisler et al., 2022), we include recent developments for solving nonlinear equations using both automatic and symbolic differentiation, and demonstrate the capability for the Bratu (nonlinear Poisson) equation.
Full Text Available
Massively parallel simulations of binary black holes with adaptive wavelet multiresolution

https://doi.org/10.1103/PhysRevD.107.064035

Fernando, Milinda; Neilsen, David; Zlochower, Yosef; Hirschmann, Eric W.; Sundar, Hari (March 2023, Physical Review D)

Full Text Available
A projection-based, semi-implicit time-stepping approach for the Cahn-Hilliard Navier-Stokes equations on adaptive octree meshes

https://doi.org/10.1016/j.jcp.2022.111874

Khanwale, Makrand A.; Saurabh, Kumar; Ishii, Masado; Sundar, Hari; Rossmanith, James A.; Ganapathysubramanian, Baskar (February 2023, Journal of Computational Physics)

Full Text Available
A Domain Specific Language Applied to Phonon Boltzmann Transport for Heat Conduction

https://doi.org/10.1115/IMECE2022-95034

Heisler, Eric; Saurav, Siddharth; Deshmukh, Aadesh; Mazumder, Sandip; Sadayappan, Ponnuswamy; Sundar, Hari (October 2022, ASME International Mechanical Engineering Congress and Exposition)

The phonon Boltzmann transport equation is a good model for heat transfer in nanometer scale structures such as semiconductor devices. Computational complexity is one of the main challenges in numerically solving this set of potentially thousands of nonlinearly coupled equations. Writing efficient code involves careful optimization and choosing an effective parallelization strategy, requiring expertise in high performance computing, mathematical methods, and thermal physics. To address this challenge, we present the domain specific language and code generation software Finch. This language allows a domain scientist to enter the equations in a simple format, provide only basic mathematical functions used in the model, and generate efficient parallel code. Even very complex systems of equations such as phonon Boltzmann transport can be entered in a very simple, intuitive way. A feature of the framework is flexibility in numerical methods, computing environments, parallel strategies, and other aspects of the generated code. We apply Finch to this problem using a variety of parallel strategies and model configurations to demonstrate its flexibility and ease of use.
Full Text Available
Finch: Domain Specific Language and Code Generation for Finite Element and Finite Volume in Julia

https://doi.org/10.1007/978-3-031-08751-6_9

Heisler, Eric; Deshmukh, Aadesh; Sundar, Hari (June 2022, Computational Science -- ICCS 2022)

We introduce FINCH, a Julia-based domain specific language (DSL) for solving partial differential equations in a discretization-agnostic way, currently including finite element and finite volume methods. A key focus is code generation for various internal or external software targets. Internal targets use a modular set of tools in Julia providing a direct solution within the framework. In contrast, external code generation produces a set of code files to be compiled and run with external libraries or frameworks. Examples include a MATLAB target, for smaller problems or prototyping, and C++/MPI-based targets for larger problems needing scalability. This allows us to take advantage of the capabilities of those external tools without needlessly duplicating them, and provides options tailored to the needs of the domain scientist. The modular design of FINCH allows ongoing development of these target modules, resulting in a more extensible framework and a broader set of applications. The support for multiple discretizations, including finite element and finite volume methods, also contributes to this goal. Another focus of this project is complex systems containing a large set of coupled PDEs that could be challenging to efficiently code and optimize by hand, but that are relatively simple to specify using the DSL. In this paper we present the key features of FINCH that set it apart from many other DSL options, and demonstrate the basic usage and current capabilities through examples.
Full Text Available
