Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Heterogeneous computing environments combining CPU and GPU resources provide a great boost to large-scale scientific computing applications. Code generation utilities that partition the work into CPU and GPU tasks while considering data movement costs allow researchers to develop high-performance solutions more quickly and easily, and make these resources accessible to a larger user base.We present developments for a domain-specific language (DSL) and code generation framework for solving partial differential equations (PDEs). These enhancements facilitate GPU-accelerated solution of the Boltzmann transport equation (BTE) for phonons, which is the governing equation for simulating thermal transport in semiconductor materials at sub-micron scales. The solution of the BTE involves thousands of coupled PDEs as well as complicated boundary conditions and solving a nonlinear equation that couples all of the degrees of freedom at each time step. These developments enable the DSL to generate configurable hybrid GPU/CPU code that couples accelerated kernels with user-defined code. We observed performance improvements of around 18X compared to a CPU-only version produced by this same DSL with minimal additional programming effort.more » « lessFree, publicly-accessible full text available May 27, 2025
-
The Boltzmann Transport Equation (BTE) for phonons is often used to predict thermal transport at submicron scales in semiconductors. The BTE is a seven-dimensional nonlinear integro-differential equation, resulting in difficulty in its solution even after linearization under the single relaxation time approximation. Furthermore, parallelization and load balancing are challenging, given the high dimensionality and variability of the linear systems' conditioning. This work presents a 'synthetic' scalable parallelization method for solving the BTE on large-scale systems. The method includes cell-based parallelization, combined band+cell-based parallelization, and batching technique. The essential computational ingredient of cell-based parallelization is a sparse matrix-vector product (SpMV) that can be integrated with an existing linear algebra library like PETSc. The combined approach enhances the cell-based method by further parallelizing the band dimension to take advantage of low inter-band communication costs. For the batched approach, we developed a batched SpMV that enables multiple linear systems to be solved simultaneously, merging many MPI messages to reduce communication costs, thus maintaining scalability when the grain size becomes very small. We present numerical experiments to demonstrate our method's excellent speedups and scalability up to 16384 cores for a problem with 12.6 billion unknowns.more » « less
-
Finch, a domain specific language and code generation framework for partial differential equations (PDEs), is demonstrated here to solve two classical problems: steady-state advection diffusion equation (single PDE) and the phonon Boltzmann transport equation (coupled PDEs). Both finite volume and finite element methods are explored. In addition to work presented at the 2022 International Conference on Computational Science (Heisler et al., 2022), we include recent developments for solving nonlinear equations using both automatic and symbolic differentiation, and demonstrate the capability for the Bratu (nonlinear Poisson) equation.more » « less
-
The phonon Boltzmann transport equation is a good model for heat transfer in nanometer scale structures such as semiconductor devices. Computational complexity is one of the main challenges in numerically solving this set of potentially thousands of nonlinearly coupled equations. Writing efficient code will involve careful optimization and choosing an effective parallelization strategy, requiring expertise in high performance computing, mathematical methods, and thermal physics. To address this challenge, we present the domain specific language and code generation software Finch. This language allows a domain scientist to enter the equations in a simple format, provide only basic mathematical functions used in the model, and generate efficient parallel code. Even very complex systems of equations such as phonon Boltzmann transport can be entered in a very simple, intuitive way. A feature of the framework is flexibility in numerical methods, computing environments, parallel strategies, and other aspects of the generated code. We demonstrate Finch on this problem using a variety of parallel strategies and model configurations to demonstrate the flexibility and ease of use.more » « less