skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 9:30 PM ET on Friday, January 23 until 7:00 AM ET on Saturday, January 24 due to maintenance. We apologize for the inconvenience.


Title: Efficient Distributed Matrix-free Multigrid Methods on Locally Refined Meshes for FEM Computations
This work studies three multigrid variants for matrix-free finite-element computations on locally refined meshes: geometric local smoothing, geometric global coarsening (both h -multigrid), and polynomial global coarsening (a variant of p -multigrid). We have integrated the algorithms into the same framework—the open source finite-element library deal.II —, which allows us to make fair comparisons regarding their implementation complexity, computational efficiency, and parallel scalability as well as to compare the measurements with theoretically derived performance metrics. Serial simulations and parallel weak and strong scaling on up to 147,456 CPU cores on 3,072 compute nodes are presented. The results obtained indicate that global-coarsening algorithms show a better parallel behavior for comparable smoothers due to the better load balance, particularly on the expensive fine levels. In the serial case, the costs of applying hanging-node constraints might be significant, leading to advantages of local smoothing, even though the number of solver iterations needed is slightly higher. When using p - and h -multigrid in sequence ( hp -multigrid), the results indicate that it makes sense to decrease the degree of the elements first from a performance point of view due to the cheaper transfer.  more » « less
Award ID(s):
1925575 2015848 2028346
PAR ID:
10429815
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ACM Transactions on Parallel Computing
Volume:
10
Issue:
1
ISSN:
2329-4949
Page Range / eLocation ID:
1 to 38
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We present the design and implementation details of a geometric multigrid method on adaptively refined meshes for massively parallel computations. The method uses local smoothing on the refined part of the mesh. Partitioning is achieved by using a space filling curve for the leaf mesh and distributing ancestors in the hierarchy based on the leaves. We present a model of the efficiency of mesh hierarchy distribution and compare its predictions to runtime measurements. The algorithm is implemented as part of the deal.II finite-element library and as such available to the public. 
    more » « less
  2. We present the lowest-order hybridizable discontinuous Galerkin schemes with numerical integration (quadrature), denoted as HDG-P0 for the reaction-diffusion equation and the generalized Stokes equations on conforming simplicial meshes in two- and three-dimensions. Here by lowest order, we mean that the (hybrid) finite element space for the global HDG facet degrees of freedom (DOFs) is the space of piecewise constants on the mesh skeleton. A discontinuous piecewise linear space is used for the approximation of the local primal unknowns. We give the optimal a priori error analysis of the proposed HDG-P0 schemes, which hasn’t appeared in the literature yet for HDG discretizations as far as numerical integration is concerned. Moreover, we propose optimal geometric multigrid preconditioners for the statically condensed HDG-P0 linear systems on conforming simplicial meshes. In both cases, we first establish the equivalence of the statically condensed HDG system with a (slightly modified) nonconforming Crouzeix–Raviart (CR) discretization, where the global (piecewise-constant) HDG finite element space on the mesh skeleton has a natural one-to-one correspondence to the nonconforming CR (piecewise-linear) finite element space that live on the whole mesh. This equivalence then allows us to use the well-established nonconforming geometry multigrid theory to precondition the condensed HDG system. Numerical results in two- and three-dimensions are presented to verify our theoretical findings. 
    more » « less
  3. We devise multigrid preconditioners for linear-quadratic space-time distributed parabolic optimal control problems. While our method is rooted in earlier work on elliptic control, the temporal dimension presents new challenges in terms of algorithm design and quality. Our primary focus is on the cG(s)dG(r) discretizations which are based on functions that are continuous in space and discontinuous in time, but our technique is applicable to various other space-time finite element discretizations. We construct and analyse two kinds of multigrid preconditioners: the first is based on full coarsening in space and time, while the second is based on semi-coarsening in space only. Our analysis, in conjunction with numerical experiments, shows that both preconditioners are of optimal order with respect to the discretization in case of cG(1)dG(r) for r = 0, 1 and exhibit a suboptimal behaviour in time for Crank–Nicolson. We also show that, under certain conditions, the preconditioner using full space-time coarsening is more efficient than the one involving semi-coarsening in space, a phenomenon that has not been observed previously. Our numerical results confirm the theoretical findings. 
    more » « less
  4. na (Ed.)
    Thehp-adaptive finite element method—where one independently chooses the mesh size (h) and polynomial degree (p) to be used on each cell—has long been known to have better theoretical convergence properties than eitherh- orp-adaptive methods alone. However, it is not widely used, owing at least in part to the difficulty of the underlying algorithms and the lack of widely usable implementations. This is particularly true when used with continuous finite elements. Herein, we discuss algorithms that are necessary for a comprehensive and generic implementation ofhp-adaptive finite element methods on distributed-memory, parallel machines. In particular, we will present a multistage algorithm for the unique enumeration of degrees of freedom suitable for continuous finite element spaces, describe considerations for weighted load balancing, and discuss the transfer of variable size data between processes. We illustrate the performance of our algorithms with numerical examples and demonstrate that they scale reasonably up to at least 16,384 message passage interface processes. We provide a reference implementation of our algorithms as part of the open source librarydeal.II. 
    more » « less
  5. Abstract Immersed finite element methods provide a convenient analysis framework for problems involving geometrically complex domains, such as those found in topology optimization and microstructures for engineered materials. However, their implementation remains a major challenge due to, among other things, the need to apply nontrivial stabilization schemes and generate custom quadrature rules. This article introduces the robust and computationally efficient algorithms and data structures comprising an immersed finite element preprocessing framework. The input to the preprocessor consists of a background mesh and one or more geometries defined on its domain. The output is structured into groups of elements with custom quadrature rules formatted such that common finite element assembly routines may be used without or with only minimal modifications. The key to the preprocessing framework is the construction of material topology information, concurrently with the generation of a quadrature rule, which is then used to perform enrichment and generate stabilization rules. While the algorithmic framework applies to a wide range of immersed finite element methods using different types of meshes, integration, and stabilization schemes, the preprocessor is presented within the context of the extended isogeometric analysis. This method utilizes a structured B-spline mesh, a generalized Heaviside enrichment strategy considering the material layout within individual basis functions’ supports, and face-oriented ghost stabilization. Using a set of examples, the effectiveness of the enrichment and stabilization strategies is demonstrated alongside the preprocessor’s robustness in geometric edge cases. Additionally, the performance and parallel scalability of the implementation are evaluated. 
    more » « less