
Title: Algorithms for Parallel Generic hp-Adaptive Finite Element Software

The hp-adaptive finite element method—where one independently chooses the mesh size (h) and polynomial degree (p) to be used on each cell—has long been known to have better theoretical convergence properties than either h- or p-adaptive methods alone. However, it is not widely used, owing at least in part to the difficulty of the underlying algorithms and the lack of widely usable implementations. This is particularly true when used with continuous finite elements.

Herein, we discuss algorithms that are necessary for a comprehensive and generic implementation of hp-adaptive finite element methods on distributed-memory, parallel machines. In particular, we present a multistage algorithm for the unique enumeration of degrees of freedom suitable for continuous finite element spaces, describe considerations for weighted load balancing, and discuss the transfer of variable-size data between processes. We illustrate the performance of our algorithms with numerical examples and demonstrate that they scale reasonably up to at least 16,384 Message Passing Interface (MPI) processes.

We provide a reference implementation of our algorithms as part of the open source library deal.II.
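To make the discussion concrete, the following is a minimal sketch of how a distributed hp-discretization can be set up through deal.II's public interfaces. It assumes a recent deal.II release in which DoFHandler supports hp-adaptivity directly; the mesh, the range of element degrees, and the rule for assigning a degree to each cell are purely illustrative and are not the paper's algorithm.

```cpp
// Minimal sketch: distributed hp discretization with deal.II (assumes a recent
// deal.II release in which DoFHandler supports hp-adaptivity directly).
#include <deal.II/base/mpi.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/grid/grid_generator.h>
#include <deal.II/hp/fe_collection.h>

#include <iostream>

int main(int argc, char *argv[])
{
  using namespace dealii;
  Utilities::MPI::MPI_InitFinalize mpi_init(argc, argv, 1);

  constexpr int dim = 2;

  // Mesh distributed across all MPI processes.
  parallel::distributed::Triangulation<dim> triangulation(MPI_COMM_WORLD);
  GridGenerator::hyper_cube(triangulation);
  triangulation.refine_global(4);

  // Collection of continuous elements of degree 1..4; each cell later
  // picks one of these via its active_fe_index.
  hp::FECollection<dim> fe_collection;
  for (unsigned int degree = 1; degree <= 4; ++degree)
    fe_collection.push_back(FE_Q<dim>(degree));

  DoFHandler<dim> dof_handler(triangulation);

  // Illustrative assignment only: in practice the degree would come from
  // an error or smoothness indicator rather than the cell's position.
  for (const auto &cell : dof_handler.active_cell_iterators())
    if (cell->is_locally_owned())
      cell->set_active_fe_index(cell->center()[0] < 0.5 ? 0 : 3);

  // Enumerate degrees of freedom; for continuous elements this is where a
  // globally unique, parallel-consistent numbering must be established.
  dof_handler.distribute_dofs(fe_collection);

  if (Utilities::MPI::this_mpi_process(MPI_COMM_WORLD) == 0)
    std::cout << "Total DoFs: " << dof_handler.n_dofs() << std::endl;
}
```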

 
Award ID(s): 1821210
NSF-PAR ID: 10490639
Author(s) / Creator(s): ;
Publisher / Repository: ACM
Date Published:
Journal Name: ACM Transactions on Mathematical Software
Volume: 49
Issue: 3
ISSN: 0098-3500
Page Range / eLocation ID: 1 to 26
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like This
  1. SUMMARY

    Combining finite element methods for the incompressible Stokes equations with particle-in-cell methods is an important technique in computational geodynamics that has been widely applied in mantle convection, lithosphere dynamics and crustal-scale modelling. In these applications, particles are used to carry properties of the medium, such as the temperature, chemical composition or other material properties, along with the flow; the particle methods therefore reduce the advection equation to an ordinary differential equation for each particle, resulting in a problem that is simpler to solve than the original field equation, for which stabilization techniques would be necessary to avoid oscillations.
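    As an illustration of this reduction to an ordinary differential equation, the sketch below advects tracer particles through a prescribed velocity field with a second-order Runge-Kutta (midpoint) step. The analytic velocity function is a hypothetical stand-in for the interpolated finite element Stokes solution, and the code is not taken from ASPECT or deal.II.

```cpp
// Minimal sketch: advecting tracer particles by integrating dx/dt = u(x, t)
// with a second-order Runge-Kutta (midpoint) step. The velocity field here is
// a hypothetical analytic stand-in; in a real solver it would be interpolated
// from the finite element Stokes solution at each particle location.
#include <array>
#include <cstdio>
#include <vector>

using Point2 = std::array<double, 2>;

// Placeholder velocity: a solid-body rotation about the origin.
Point2 velocity(const Point2 &x, double /*t*/)
{
  return {-x[1], x[0]};
}

void advect_rk2(std::vector<Point2> &particles, double t, double dt)
{
  for (auto &x : particles)
    {
      const Point2 u1  = velocity(x, t);
      const Point2 mid = {x[0] + 0.5 * dt * u1[0], x[1] + 0.5 * dt * u1[1]};
      const Point2 u2  = velocity(mid, t + 0.5 * dt);
      x[0] += dt * u2[0];
      x[1] += dt * u2[1];
    }
}

int main()
{
  std::vector<Point2> particles = {{1.0, 0.0}, {0.5, 0.5}};
  const double dt = 0.01;
  for (int step = 0; step < 100; ++step)
    advect_rk2(particles, step * dt, dt);
  for (const auto &p : particles)
    std::printf("(%f, %f)\n", p[0], p[1]);
}
```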

    On the other hand, replacing field-based descriptions by quantities defined only at the locations of particles introduces numerical errors. These errors have previously been investigated, but a complete understanding from both the theoretical and practical sides has so far been lacking. In addition, we are not aware of systematic guidance on how many particles one needs per mesh cell to achieve a certain accuracy.

    In this paper we modify two existing instantaneous benchmarks and present two new analytic benchmarks for time-dependent incompressible Stokes flow in order to compare the convergence rate and accuracy of various combinations of finite elements, particle advection and particle interpolation methods. Using these benchmarks, we find that in order to retain the optimal accuracy of the finite element formulation, one needs to use a sufficiently accurate particle interpolation algorithm. Additionally, we observe and explain that for our higher-order finite-element methods it is necessary to increase the number of particles per cell as the mesh resolution increases (i.e. as the grid cell size decreases) to avoid a reduction in convergence order.

    Our methods and results allow designing new particle-in-cell methods with specific convergence rates, and also provide guidance for the choice of common building blocks and parameters such as the number of particles per cell. In addition, our new time-dependent benchmark provides a simple test that can be used to compare different implementations and algorithms, and to assess new numerical methods for particle interpolation and advection. We provide a reference implementation of this benchmark in ASPECT (the ‘Advanced Solver for Problems in Earth’s ConvecTion’), an open source code for geodynamic modelling.

     
  2. Abstract

    As the use of spectral/hp element methods, and high-order finite element methods in general, continues to spread, community efforts to create efficient, optimized algorithms associated with fundamental high-order operations have grown. Core tasks such as solution expansion evaluation at quadrature points, stiffness and mass matrix generation, and matrix assembly have received tremendous attention. With the expansion of the types of problems to which high-order methods are applied, and correspondingly the growth in types of numerical tasks accomplished through high-order methods, the number and types of these core operations broaden. This work focuses on solution expansion evaluation at arbitrary points within an element. This operation is core to many postprocessing applications such as evaluation of streamlines and pathlines, as well as to field projection techniques such as mortaring. We expand barycentric interpolation techniques developed on an interval to 2D (triangles and quadrilaterals) and 3D (tetrahedra, prisms, pyramids, and hexahedra) spectral/hp element methods. We provide efficient algorithms for their implementations, and demonstrate their effectiveness using the spectral/hp element library Nektar++ by running a series of baseline evaluations against the ‘standard’ Lagrangian method, where an interpolation matrix is generated and matrix multiplication applied to evaluate a point at a given location. We present results from a rigorous series of benchmarking tests for a variety of element shapes, polynomial orders and dimensions. We show that when the point of interest is to be repeatedly evaluated, the barycentric method performs at worst 50% slower when compared to a cached matrix evaluation. However, when the point of interest changes repeatedly so that the interpolation matrix must be regenerated in the ‘standard’ approach, the barycentric method yields far greater performance, with a minimum speedup factor of 7×. Furthermore, when derivatives of the solution evaluation are also required, the barycentric method in general slightly outperforms the cached interpolation matrix method across all elements and orders, with an up to 30% speedup. Finally, we investigate a real-world example of scalar transport using a non-conformal discontinuous Galerkin simulation, in which we observe around a 6× speedup in computational time for the barycentric method compared to the matrix-based approach. We also explore the complexity of both interpolation methods and show that the barycentric interpolation method requires O(k) storage compared to a best-case space complexity of O(k²) for the Lagrangian interpolation matrix method.
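    For reference, the one-dimensional building block that is generalized here to 2D and 3D element shapes is standard barycentric Lagrange interpolation, sketched below; the nodes and sample data are illustrative and the code does not use the Nektar++ API. Only the k barycentric weights need to be stored, which is the source of the O(k) storage figure quoted above.

```cpp
// Minimal sketch of 1D barycentric Lagrange interpolation; node locations and
// data are illustrative, not Nektar++ code.
#include <cmath>
#include <cstdio>
#include <vector>

// Precompute barycentric weights w_j = 1 / prod_{k != j} (x_j - x_k).
std::vector<double> barycentric_weights(const std::vector<double> &nodes)
{
  const std::size_t n = nodes.size();
  std::vector<double> w(n, 1.0);
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t k = 0; k < n; ++k)
      if (k != j)
        w[j] /= (nodes[j] - nodes[k]);
  return w;
}

// Evaluate p(x) = [sum_j w_j/(x - x_j) f_j] / [sum_j w_j/(x - x_j)].
double barycentric_eval(const std::vector<double> &nodes,
                        const std::vector<double> &w,
                        const std::vector<double> &values,
                        double x)
{
  double num = 0.0, den = 0.0;
  for (std::size_t j = 0; j < nodes.size(); ++j)
    {
      const double diff = x - nodes[j];
      if (std::abs(diff) < 1e-14)  // x coincides with a node
        return values[j];
      const double c = w[j] / diff;
      num += c * values[j];
      den += c;
    }
  return num / den;
}

int main()
{
  // Interpolate f(x) = x^3 on four nodes in [-1, 1]; exact for a cubic.
  const std::vector<double> nodes = {-1.0, -0.5, 0.5, 1.0};
  std::vector<double> values;
  for (double x : nodes)
    values.push_back(x * x * x);

  const auto w = barycentric_weights(nodes);
  std::printf("p(0.25) = %f (exact %f)\n",
              barycentric_eval(nodes, w, values, 0.25), 0.25 * 0.25 * 0.25);
}
```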

     
  3. This work studies three multigrid variants for matrix-free finite-element computations on locally refined meshes: geometric local smoothing, geometric global coarsening (both h-multigrid), and polynomial global coarsening (a variant of p-multigrid). We have integrated the algorithms into the same framework, the open source finite-element library deal.II, which allows us to make fair comparisons regarding their implementation complexity, computational efficiency, and parallel scalability, as well as to compare the measurements with theoretically derived performance metrics. Serial simulations and parallel weak and strong scaling on up to 147,456 CPU cores on 3,072 compute nodes are presented. The results obtained indicate that global-coarsening algorithms show better parallel behavior for comparable smoothers due to the better load balance, particularly on the expensive fine levels. In the serial case, the costs of applying hanging-node constraints might be significant, leading to advantages of local smoothing, even though the number of solver iterations needed is slightly higher. When using p- and h-multigrid in sequence (hp-multigrid), the results indicate that it makes sense to decrease the degree of the elements first from a performance point of view, due to the cheaper transfer.
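    The observation about ordering p- and h-coarsening can be illustrated with a sketch of how such a level hierarchy might be assembled: reduce the polynomial degree on the fine mesh first, then coarsen the mesh at the lowest degree. This is a conceptual illustration only, not deal.II's actual multigrid transfer setup.

```cpp
// Minimal sketch of constructing an hp-multigrid level hierarchy in which the
// polynomial degree is reduced first (p-coarsening) and the mesh is coarsened
// afterwards (h-coarsening). Conceptual illustration only, not deal.II code.
#include <cstdio>
#include <vector>

struct Level
{
  unsigned int degree;       // polynomial degree on this level
  unsigned int refinements;  // global mesh refinement level
};

std::vector<Level> build_hp_hierarchy(unsigned int fine_degree,
                                      unsigned int fine_refinements)
{
  std::vector<Level> levels;
  // 1) p-coarsening: roughly halve the degree down to 1 on the fine mesh.
  for (unsigned int p = fine_degree; p >= 1; p = (p > 1 ? p / 2 : 0))
    levels.push_back({p, fine_refinements});
  // 2) h-coarsening: keep degree 1 and coarsen the mesh level by level.
  for (unsigned int r = fine_refinements; r-- > 0;)
    levels.push_back({1, r});
  return levels;
}

int main()
{
  // Fine level: degree 4 on a mesh refined 3 times.
  for (const auto &level : build_hp_hierarchy(4, 3))
    std::printf("degree %u, refinement level %u\n",
                level.degree, level.refinements);
}
```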
  4. Abstract

    We present an approach for the inclusion of nonspherical constituents in high-resolution N-body discrete element method (DEM) simulations. We use aggregates composed of bonded spheres to model nonspherical components. Though the method may be applied more generally, we detail our implementation in the existing N-body code pkdgrav. It has long been acknowledged that nonspherical grains confer additional shear strength and resistance to flow when compared with spheres. As a result, we expect that rubble-pile asteroids will also exhibit these properties and may behave differently than comparable rubble piles composed of idealized spheres. Since spherical particles avoid some significant technical challenges, most DEM gravity codes have used only spherical particles or have been confined to relatively low resolutions. We also discuss the work that has gone into improving performance with nonspherical grains, building on pkdgrav's existing leading-edge computational efficiency among DEM gravity codes. This allows for the addition of nonspherical shapes while maintaining the efficiencies afforded by pkdgrav's tree implementation and parallelization. As a test, we simulated the gravitational collapse of 25,000 nonspherical bodies in parallel. In this case, the efficiency improvements allowed for an increase in speed by nearly a factor of 3 when compared with the naive implementation. Without these enhancements, large runs with nonspherical components would remain prohibitively expensive. Finally, we present the results of several small-scale tests: spin-up due to the YORP effect, tidal encounters, and the Brazil nut effect. In all cases, we find that the inclusion of nonspherical constituents has a measurable impact on simulation outcomes.
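    One ingredient of a bonded-sphere approach is computing the rigid-body properties of an aggregate from its constituent spheres. The sketch below, which is illustrative and not pkdgrav code, accumulates total mass, center of mass, and the inertia tensor using the solid-sphere inertia (2/5)mr² and the parallel-axis theorem.

```cpp
// Minimal sketch: total mass, center of mass, and inertia tensor of a rigid
// aggregate of bonded spheres. Illustrative only; this is not pkdgrav code.
#include <array>
#include <cstdio>
#include <vector>

struct Sphere
{
  std::array<double, 3> center;
  double radius;
  double mass;
};

struct RigidBodyProperties
{
  double mass = 0.0;
  std::array<double, 3> com = {0.0, 0.0, 0.0};
  double inertia[3][3] = {};  // about the center of mass
};

RigidBodyProperties aggregate_properties(const std::vector<Sphere> &spheres)
{
  RigidBodyProperties body;
  for (const auto &s : spheres)
    {
      body.mass += s.mass;
      for (int i = 0; i < 3; ++i)
        body.com[i] += s.mass * s.center[i];
    }
  for (int i = 0; i < 3; ++i)
    body.com[i] /= body.mass;

  for (const auto &s : spheres)
    {
      // Solid-sphere inertia about its own center: (2/5) m r^2 on the diagonal.
      const double I_sphere = 0.4 * s.mass * s.radius * s.radius;
      const std::array<double, 3> d = {s.center[0] - body.com[0],
                                       s.center[1] - body.com[1],
                                       s.center[2] - body.com[2]};
      const double d2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2];
      for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
          {
            // Parallel-axis theorem: m (|d|^2 delta_ij - d_i d_j).
            body.inertia[i][j] += s.mass * ((i == j ? d2 : 0.0) - d[i] * d[j]);
            if (i == j)
              body.inertia[i][j] += I_sphere;
          }
    }
  return body;
}

int main()
{
  // A dumbbell of two equal spheres along the x-axis.
  const std::vector<Sphere> dumbbell = {{{-1.0, 0.0, 0.0}, 0.5, 1.0},
                                        {{+1.0, 0.0, 0.0}, 0.5, 1.0}};
  const RigidBodyProperties body = aggregate_properties(dumbbell);
  std::printf("mass = %f, Izz = %f\n", body.mass, body.inertia[2][2]);
}
```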

     
  5. Abstract

    Hexagonal boron nitride (h-BN) is a layered inorganic synthetic crystal exhibiting high temperature stability and high thermal conductivity. As a ceramic material it has been widely used for thermal management, heat shielding, lubrication, and as a filler material for structural composites. Recent scientific advances in isolating atomically thin monolayers from layered van der Waals crystals to study their unique properties have propelled research interest in mono/few-layered h-BN as a wide-bandgap insulating support for nanoscale electronics, tunnel barriers, communications, neutron detectors, optics, sensing, novel separations, and quantum emission from defects, among others. Realizing these futuristic applications hinges on scalable, cost-effective, high-quality h-BN synthesis. Here, the authors review scalable approaches to high-quality mono/multilayer h-BN synthesis, discuss the challenges and opportunities for each method, and contextualize their relevance to emerging applications. Maintaining a stoichiometric balance B:N = 1 as the atoms incorporate into the growing layered crystal, and maintaining stacking order between layers during multilayer synthesis, emerge as some of the main challenges for h-BN synthesis; the development of processes to address these aspects can inform and guide the synthesis of other layered materials with more than one constituent element. Finally, the authors contextualize h-BN synthesis efforts along with quality requirements for emerging applications via a technological roadmap.

     