We introduce NRPyElliptic, an elliptic solver for numerical relativity (NR) built within the NRPy+ framework. As its first application, NRPyElliptic sets up conformally flat, binary black hole (BBH) puncture initial data (ID) on a single numerical domain, similar to the widely used TwoPunctures code. Unlike TwoPunctures, NRPyElliptic employs a hyperbolic relaxation scheme, whereby arbitrary elliptic PDEs are trivially transformed into a hyperbolic system of PDEs. As consumers of NR ID generally already possess expertise in solving hyperbolic PDEs, they will generally find NRPyElliptic easier to tweak and extend than other NR elliptic solvers. When evolved forward in (pseudo)time, the hyperbolic system relaxes exponentially to a steady state that solves the elliptic PDEs. Notably, NRPyElliptic accelerates the relaxation waves, which makes it many orders of magnitude faster than the usual constant-wavespeed approach. While it is still ∼12× slower than TwoPunctures at setting up full-3D BBH ID, NRPyElliptic requires only ≈0.3% of the runtime for a full BBH simulation in the Einstein Toolkit. Future work will focus on improving performance and generating other types of ID, such as binary neutron star.
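The hyperbolic relaxation idea described above can be illustrated on a toy problem. The following is a minimal sketch, not NRPyElliptic's implementation: it assumes the 1D Poisson equation u'' = ρ on [0, 1] with u = 0 at both ends, rewritten as the damped first-order system ∂ₜu = v − ηu, ∂ₜv = c²(u'' − ρ). The function name `relax_poisson_1d`, the damping `eta`, the constant wavespeed `c`, and the semi-implicit Euler time stepper are all illustrative assumptions. In particular, the wavespeed here is constant, whereas the abstract states that NRPyElliptic accelerates the relaxation waves.

```python
import math


def relax_poisson_1d(n=50, c=1.0, eta=2.0 * math.pi, cfl=0.5, t_end=10.0):
    """Relax u_xx = rho on [0, 1] with u(0) = u(1) = 0 in pseudo-time.

    Assumed toy system (not taken from the source):
        du/dt = v - eta * u
        dv/dt = c^2 * (u_xx - rho)
    At steady state v = eta * u and u_xx = rho, so u solves the elliptic problem.
    """
    dx = 1.0 / n
    dt = cfl * dx / c  # CFL-limited pseudo-time step
    x = [i * dx for i in range(n + 1)]
    # Source term chosen so that the exact solution is sin(pi * x).
    rho = [-math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * (n + 1)  # initial guess; boundary values stay at zero
    v = [0.0] * (n + 1)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Update v from the current residual of the elliptic equation.
        for i in range(1, n):
            lap = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / dx ** 2
            v[i] += dt * c * c * (lap - rho[i])
        # Update u using the freshly updated v (semi-implicit Euler).
        for i in range(1, n):
            u[i] += dt * (v[i] - eta * u[i])
    return x, u


if __name__ == "__main__":
    x, u = relax_poisson_1d()
    err = max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(x, u))
    print("max error vs. sin(pi x):", err)
```

Once the pseudo-time evolution has settled, the remaining error is the second-order finite-difference truncation error of the steady state, not an artifact of the time stepper.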
INTIACC: A 32-bit Floating-Point Programmable Custom-ISA Accelerator for Solving Classes of Partial Differential Equations
We propose a numerical integration accelerator (INTIACC) that speeds up the solution of partial differential equations (PDEs) for scientific computing. In contrast to recent works, INTIACC applies to a variety of PDEs and boundary conditions, has enhanced nonlinear function capability, supports high-order integration algorithms, and uses floating-point arithmetic for orders of magnitude smaller solution error. Even with these added capabilities, our test chip achieves a 40× speed-up over prior accelerators and orders of magnitude over CPU- and GPU-based systems.
- Award ID(s):
- 1840763
- PAR ID:
- 10464827
- Date Published:
- Journal Name:
- ESSCIRC 2022- IEEE 48th European Solid State Circuits Conference (ESSCIRC)
- Page Range / eLocation ID:
- 349 to 352
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Mathew, Sanu (Ed.) This article presents a 32-bit floating-point (FP32) programmable accelerator for solving a wide range of partial differential equations (PDEs) based on numerical integration methods. Unlike prior works, which use fixed-point systems and apply only to specific types of PDEs, our proposed integration accelerator for PDEs, named INTIACC, consists of 16 locally interconnected processing elements (PEs), where each PE is a fully programmable reduced instruction set computer (RISC) processor with an FP32 arithmetic logic unit (FP32 ALU) and a custom-designed instruction set architecture (ISA). These features enable INTIACC to generate solutions with high precision and a wide dynamic range, and also allow users to implement different numerical algorithms to perform high-order integration methods and to evaluate nonlinear functions. In addition, we create a novel slow-global-fast-local clocking scheme in which PEs operate asynchronously with each other most of the time. We prototype the INTIACC test chip in 65 nm, with a core area of 0.975 mm². Running at an average local clock frequency of 570 MHz at 1 V, it offers a single-precision computation throughput of 9.12 GFLOPS. Testing results show that, with a similar energy-delay product, INTIACC is up to 40× faster than the prior state-of-the-art PDE solver.
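The quoted throughput is consistent with a simple back-of-the-envelope check. The sketch below assumes each PE completes one FP32 operation per local clock cycle; that rate is an assumption, as the abstract does not state it. The helper name `peak_gflops` is hypothetical.

```python
def peak_gflops(num_pes, clock_mhz, flops_per_cycle=1):
    """Aggregate throughput in GFLOPS.

    flops_per_cycle=1 is an assumption, not a figure from the abstract.
    """
    return num_pes * clock_mhz * flops_per_cycle / 1000.0


if __name__ == "__main__":
    # 16 PEs at an average local clock of 570 MHz, as quoted in the abstract.
    print(peak_gflops(16, 570))
```

Under that assumption, 16 PEs × 570 MHz gives 9120 MFLOPS, which matches the 9.12 GFLOPS figure.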
-
Adaptive mesh refinement (AMR) is the art of solving PDEs on a mesh hierarchy with increasing mesh refinement at each level of the hierarchy. Accurate treatment on AMR hierarchies requires accurate prolongation of the solution from a coarse mesh to a newly defined finer mesh. For scalar variables, suitably high-order finite volume WENO methods can carry out such a prolongation. However, classes of PDEs, such as computational electrodynamics (CED) and magnetohydrodynamics (MHD), require that vector fields preserve a divergence constraint. The primal variables in such schemes consist of normal components of the vector field that are collocated at the faces of the mesh. As a result, the reconstruction and prolongation strategies for divergence constraint-preserving vector fields are necessarily more intricate. In this paper we present a fourth-order divergence constraint-preserving prolongation strategy that is analytically exact. Extension to higher orders using analytically exact methods is very challenging. To overcome that challenge, a novel WENO-like reconstruction strategy is invented that matches the moments of the vector field in the faces, where the vector field components are collocated. This approach is almost divergence constraint-preserving; therefore, we call it WENO-ADP. To make it exactly divergence constraint-preserving, a touch-up procedure is developed that is based on a constrained least squares (CLSQ) method for restoring the divergence constraint up to machine accuracy. With the touch-up, it is called WENO-ADPT. It is shown that refinement ratios of two and higher can be accommodated. An item of broader interest in this work is that we have also been able to invent very efficient finite volume WENO methods, where the coefficients are very easily obtained and the multidimensional smoothness indicators can be expressed as perfect squares.
We demonstrate that the divergence constraint-preserving strategy works at several high orders for divergence-free vector fields as well as for vector fields where the divergence has to match a charge density and its higher moments. We also show that our methods overcome the late-time instability that has been known to plague adaptive computations in CED.
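To make the face-collocated layout concrete, the sketch below shows the constraint that such a prolongation must preserve. It does not implement the WENO-ADP or WENO-ADPT prolongation itself. It assumes a uniform 2D mesh and builds the face-normal components from a corner-centered potential `az`, a standard staggered-mesh construction used here as an illustrative assumption. The function names are hypothetical.

```python
import math


def face_fields_from_potential(n):
    """Build face-normal components (bx on x-faces, by on y-faces) on an
    n-by-n uniform mesh of the unit square from a corner-centered potential."""
    h = 1.0 / n
    # Potential sampled at the (n+1) x (n+1) mesh corners.
    az = [[math.sin(2 * math.pi * i * h) * math.cos(2 * math.pi * j * h)
           for j in range(n + 1)] for i in range(n + 1)]
    # bx = d(az)/dy, differenced along each x-face (between two corners).
    bx = [[(az[i][j + 1] - az[i][j]) / h for j in range(n)]
          for i in range(n + 1)]
    # by = -d(az)/dx, differenced along each y-face.
    by = [[-(az[i + 1][j] - az[i][j]) / h for j in range(n + 1)]
          for i in range(n)]
    return bx, by, h


def max_cell_divergence(bx, by, h):
    """Largest absolute discrete divergence over all cells."""
    n = len(by)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            div = (bx[i + 1][j] - bx[i][j]) / h + (by[i][j + 1] - by[i][j]) / h
            worst = max(worst, abs(div))
    return worst


if __name__ == "__main__":
    bx, by, h = face_fields_from_potential(8)
    print("max |div|:", max_cell_divergence(bx, by, h))
```

Because the four corner values of the potential cancel in each cell, the discrete divergence vanishes up to roundoff. A prolongation to a finer mesh has to reproduce this property on the new faces.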
-
Bayesian inference provides a systematic framework for integration of data with mathematical models to quantify the uncertainty in the solution of the inverse problem. However, the solution of Bayesian inverse problems governed by complex forward models described by partial differential equations (PDEs) remains prohibitive with black-box Markov chain Monte Carlo (MCMC) methods. We present hIPPYlib-MUQ, an extensible and scalable software framework that contains implementations of state-of-the-art algorithms aimed at overcoming the challenges of high-dimensional, PDE-constrained Bayesian inverse problems. These algorithms accelerate MCMC sampling by exploiting the geometry and intrinsic low-dimensionality of parameter space via derivative information and low-rank approximation. The software integrates two complementary open-source software packages, hIPPYlib and MUQ. hIPPYlib solves PDE-constrained inverse problems using automatically generated adjoint-based derivatives, but it lacks full Bayesian capabilities. MUQ provides a spectrum of powerful Bayesian inversion models and algorithms, but expects forward models to come equipped with gradients and Hessians to permit large-scale solution. By combining these two complementary libraries, we created a robust, scalable, and efficient software framework that realizes the benefits of each and allows us to tackle complex large-scale Bayesian inverse problems across a broad spectrum of scientific and engineering disciplines. To illustrate the capabilities of hIPPYlib-MUQ, we present a comparison of a number of MCMC methods available in the integrated software on several high-dimensional Bayesian inverse problems. These include problems characterized by both linear and nonlinear PDEs, various noise models, and different parameter dimensions.
The results demonstrate that large (∼50×) speedups over conventional black-box and gradient-based MCMC algorithms can be obtained by exploiting Hessian information (from the log-posterior), underscoring the power of the integrated hIPPYlib-MUQ framework.
-
It is urgently desired yet challenging to synthesize porous graphitic carbon (PGC) in a bottom-up manner while circumventing the need for high-temperature pyrolysis. Here we present an effective and scalable strategy to synthesize PGC through acid-mediated aldol triple condensation followed by low-temperature graphitization. The deliberate structural design enables its graphitization in situ in solution and at low pyrolysis temperature. The resulting material features ultramicroporosity characterized by a sharp pore size distribution. In addition, the pristine homogeneous composition of the reaction mixture allows for solution-processability of the material for further characterization and applications. Thin films of this PGC exhibit several orders of magnitude higher electrical conductivity compared to analogous control materials that are carbonized at the same temperatures. The integration of low-temperature graphitization and solution-processability not only allows for an energy-efficient method for the production and fabrication of PGC, but also paves the way for its wider employment in applications such as electrocatalysis, sensing, and energy storage.