Title: hIPPYlib: An Extensible Software Framework for Large-Scale Inverse Problems Governed by PDEs: Part I: Deterministic Inversion and Linearized Bayesian Inference
We present an extensible software framework, hIPPYlib, for solution of large-scale deterministic and Bayesian inverse problems governed by partial differential equations (PDEs) with (possibly) infinite-dimensional parameter fields (which are high-dimensional after discretization). hIPPYlib overcomes the prohibitively expensive nature of Bayesian inversion for this class of problems by implementing state-of-the-art scalable algorithms for PDE-based inverse problems that exploit the structure of the underlying operators, notably the Hessian of the log-posterior. The key property of the algorithms implemented in hIPPYlib is that the solution of the inverse problem is computed at a cost, measured in linearized forward PDE solves, that is independent of the parameter dimension. The mean of the posterior is approximated by the MAP point, which is found by minimizing the negative log-posterior with an inexact matrix-free Newton-CG method. The posterior covariance is approximated by the inverse of the Hessian of the negative log-posterior evaluated at the MAP point. The construction of the posterior covariance is made tractable by invoking a low-rank approximation of the Hessian of the log-likelihood. Scalable tools for sample generation are also discussed. hIPPYlib makes all of these advanced algorithms easily accessible to domain scientists and provides an environment that expedites the development of new algorithms.
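The low-rank construction of the posterior covariance mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not hIPPYlib code and does not use the hIPPYlib API: the function names, the dense toy matrices, and the random linear observation operator are all assumptions made for illustration, whereas the framework itself works matrix-free on PDE-based operators. The sketch relies on the standard identity that, if (λ_i, v_i) are the dominant eigenpairs of the data-misfit Hessian H with respect to the inverse prior covariance (H v_i = λ_i C_prior⁻¹ v_i, with v_iᵀ C_prior⁻¹ v_j = δ_ij), then (H + C_prior⁻¹)⁻¹ ≈ C_prior − V_r D_r V_rᵀ with D_r = diag(λ_i / (1 + λ_i)).

```python
import numpy as np


def low_rank_laplace_covariance(H_misfit, C_prior, rank):
    """Approximate (H_misfit + C_prior^{-1})^{-1} from `rank` dominant eigenpairs.

    Hypothetical helper for illustration only; dense matrices are assumed.
    """
    L = np.linalg.cholesky(C_prior)        # C_prior = L @ L.T
    H_tilde = L.T @ H_misfit @ L           # prior-preconditioned misfit Hessian
    lam, W = np.linalg.eigh(H_tilde)       # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][:rank]     # keep the `rank` largest eigenvalues
    lam_r = lam[idx]
    V_r = L @ W[:, idx]                    # satisfies V_r.T @ inv(C_prior) @ V_r = I
    D_r = lam_r / (1.0 + lam_r)
    return C_prior - (V_r * D_r) @ V_r.T   # C_prior - V_r diag(D_r) V_r^T


def make_toy_problem(n=50, n_obs=5, noise_var=1e-2, seed=0):
    """Toy linear-Gaussian problem (an assumption, not from the paper)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)
    # Exponential (Ornstein-Uhlenbeck) kernel: symmetric positive definite.
    C_prior = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)
    J = rng.standard_normal((n_obs, n))    # hypothetical linear observation map
    H_misfit = J.T @ J / noise_var         # Gauss-Newton misfit Hessian, rank n_obs
    return H_misfit, C_prior


def approximation_error(rank):
    """Frobenius-norm error of the rank-`rank` approximation on the toy problem."""
    H_misfit, C_prior = make_toy_problem()
    exact = np.linalg.inv(H_misfit + np.linalg.inv(C_prior))
    approx = low_rank_laplace_covariance(H_misfit, C_prior, rank)
    return np.linalg.norm(approx - exact)


if __name__ == "__main__":
    for r in (1, 2, 5):
        print(f"rank {r}: error {approximation_error(r):.3e}")
```

In this toy setting the misfit Hessian has rank equal to the number of observations, so keeping that many eigenpairs reproduces the exact Laplace covariance up to round-off, and truncating further trades accuracy for cost. In the PDE setting the dominant eigenpairs would instead be obtained from Hessian-vector products rather than from a dense factorization.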
Award ID(s):
1550547
PAR ID:
10275717
Journal Name:
ACM Transactions on Mathematical Software
Volume:
47
Issue:
2
ISSN:
0098-3500
Page Range / eLocation ID:
1 to 34
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Bayesian inference provides a systematic framework for integration of data with mathematical models to quantify the uncertainty in the solution of the inverse problem. However, the solution of Bayesian inverse problems governed by complex forward models described by partial differential equations (PDEs) remains prohibitive with black-box Markov chain Monte Carlo (MCMC) methods. We present hIPPYlib-MUQ, an extensible and scalable software framework that contains implementations of state-of-the-art algorithms that aim to overcome the challenges of high-dimensional, PDE-constrained Bayesian inverse problems. These algorithms accelerate MCMC sampling by exploiting the geometry and intrinsic low-dimensionality of parameter space via derivative information and low-rank approximation. The software integrates two complementary open-source software packages, hIPPYlib and MUQ. hIPPYlib solves PDE-constrained inverse problems using automatically generated adjoint-based derivatives, but it lacks full Bayesian capabilities. MUQ provides a spectrum of powerful Bayesian inversion models and algorithms, but expects forward models to come equipped with gradients and Hessians to permit large-scale solution. By combining these two complementary libraries, we created a robust, scalable, and efficient software framework that realizes the benefits of each and allows us to tackle complex large-scale Bayesian inverse problems across a broad spectrum of scientific and engineering disciplines. To illustrate the capabilities of hIPPYlib-MUQ, we present a comparison of a number of MCMC methods available in the integrated software on several high-dimensional Bayesian inverse problems. These include problems characterized by both linear and nonlinear PDEs, various noise models, and different parameter dimensions. The results demonstrate that large (∼50×) speedups over conventional black-box and gradient-based MCMC algorithms can be obtained by exploiting Hessian information (from the log-posterior), underscoring the power of the integrated hIPPYlib-MUQ framework.
  2. Obtaining lightweight and accurate approximations of discretized objective functional Hessians in inverse problems governed by partial differential equations (PDEs) is essential to make both deterministic and Bayesian statistical large-scale inverse problems computationally tractable. The cubic computational complexity of dense linear algebraic tasks, such as Cholesky factorization, that provide a means to sample Gaussian distributions and determine solutions of Newton linear systems is a computational bottleneck at large scale. These tasks can be reduced to log-linear complexity by utilizing hierarchical off-diagonal low-rank (HODLR) matrix approximations. In this work, we show that a class of Hessians that arise from inverse problems governed by PDEs are well approximated by the HODLR matrix format. In particular, we study inverse problems governed by PDEs that model the instantaneous viscous flow of ice sheets. In these problems, we seek a spatially distributed basal sliding parameter field such that the flow predicted by the ice sheet model is consistent with ice sheet surface velocity observations. We demonstrate the use of HODLR Hessian approximation to efficiently sample the Laplace approximation of the posterior distribution with covariance further approximated by HODLR matrix compression. Computational studies are performed which illustrate ice sheet problem regimes for which the Gauss–Newton data-misfit Hessian is more efficiently approximated by the HODLR matrix format than the low-rank (LR) format. We then demonstrate that HODLR approximations can be favorable, when compared to global LR approximations, for large-scale problems by studying the data-misfit Hessian associated with inverse problems governed by the first-order Stokes flow model on the Humboldt glacier and Greenland ice sheet.
  3. We consider hyper-differential sensitivity analysis (HDSA) of nonlinear Bayesian inverse problems governed by partial differential equations (PDEs) with infinite-dimensional parameters. In previous works, HDSA has been used to assess the sensitivity of the solution of deterministic inverse problems to additional model uncertainties and also different types of measurement data. In the present work, we extend HDSA to the class of Bayesian inverse problems governed by PDEs. The focus is on assessing the sensitivity of certain key quantities derived from the posterior distribution. Specifically, we focus on analyzing the sensitivity of the MAP point and the Bayes risk and make full use of the information embedded in the Bayesian inverse problem. After establishing our mathematical framework for HDSA of Bayesian inverse problems, we present a detailed computational approach for computing the proposed HDSA indices. We examine the effectiveness of the proposed approach on an inverse problem governed by a PDE modeling heat conduction.
  4. We consider optimal experimental design (OED) for Bayesian nonlinear inverse problems governed by partial differential equations (PDEs) under model uncertainty. Specifically, we consider inverse problems in which, in addition to the inversion parameters, the governing PDEs include secondary uncertain parameters. We focus on problems with infinite-dimensional inversion and secondary parameters and present a scalable computational framework for optimal design of such problems. The proposed approach enables Bayesian inversion and OED under uncertainty within a unified framework. We build on the Bayesian approximation error (BAE) approach, to incorporate modeling uncertainties in the Bayesian inverse problem, and on methods for A-optimal design of infinite-dimensional Bayesian nonlinear inverse problems. Specifically, a Gaussian approximation to the posterior at the maximum a posteriori probability point is used to define an uncertainty-aware OED objective that is tractable to evaluate and optimize. In particular, the OED objective can be computed at a cost, in the number of PDE solves, that does not grow with the dimension of the discretized inversion and secondary parameters. The OED problem is formulated as a binary bilevel PDE-constrained optimization problem, and a greedy algorithm, which provides a pragmatic approach, is used to find optimal designs. We demonstrate the effectiveness of the proposed approach for a model inverse problem governed by an elliptic PDE on a three-dimensional domain. Our computational results also highlight the pitfalls of ignoring modeling uncertainties in the OED and/or inference stages.
  5. We consider the problem of inferring the basal sliding coefficient field for an uncertain Stokes ice sheet forward model from synthetic surface velocity measurements. The uncertainty in the forward model stems from unknown (or uncertain) auxiliary parameters (e.g., rheology parameters). This inverse problem is posed within the Bayesian framework, which provides a systematic means of quantifying uncertainty in the solution. To account for the associated model uncertainty (error), we employ the Bayesian approximation error (BAE) approach to approximately premarginalize simultaneously over both the noise in measurements and uncertainty in the forward model. We also carry out approximative posterior uncertainty quantification based on a linearization of the parameter-to-observable map centered at the maximum a posteriori (MAP) basal sliding coefficient estimate, i.e., by taking the Laplace approximation. The MAP estimate is found by minimizing the negative log posterior using an inexact Newton conjugate gradient method. The gradient and Hessian actions on vectors are efficiently computed using adjoints. Sampling from the approximate covariance is made tractable by invoking a low-rank approximation of the data misfit component of the Hessian. We study the performance of the BAE approach in the context of three numerical examples in two and three dimensions. For each example, the basal sliding coefficient field is the parameter of primary interest which we seek to infer, and the rheology parameters (e.g., the flow rate factor or the Glen's flow law exponent coefficient field) represent so-called nuisance (secondary uncertain) parameters. Our results indicate that accounting for model uncertainty stemming from the presence of nuisance parameters is crucial. Namely, our findings suggest that using nominal values for these parameters, as is often done in practice, without taking into account the resulting modeling error, can lead to overconfident and heavily biased results. We also show that the BAE approach can be used to account for the additional model uncertainty at no additional cost at the online stage.