skip to main content

Title: Sparse Bayesian mass mapping with uncertainties: local credible intervals

Until recently, mass-mapping techniques for weak gravitational lensing convergence reconstruction have lacked a principled statistical framework upon which to quantify reconstruction uncertainties, without making strong assumptions of Gaussianity. In previous work, we presented a sparse hierarchical Bayesian formalism for convergence reconstruction that addresses this shortcoming. Here, we draw on the concept of local credible intervals (cf. Bayesian error bars) as an extension of the uncertainty quantification techniques previously detailed. These uncertainty quantification techniques are benchmarked against those recovered via Px-MALA – a state-of-the-art proximal Markov chain Monte Carlo (MCMC) algorithm. We find that, typically, our recovered uncertainties are everywhere conservative (never underestimate the uncertainty, yet the approximation error is bounded above), of similar magnitude and highly correlated with those recovered via Px-MALA. Moreover, we demonstrate an increase in computational efficiency of $\mathcal {O}(10^6)$ when using our sparse Bayesian approach over MCMC techniques. This computational saving is critical for the application of Bayesian uncertainty quantification to large-scale stage IV surveys such as LSST and Euclid.

 ;  ;  ;  ;  ;
Publication Date:
Journal Name:
Monthly Notices of the Royal Astronomical Society
Page Range or eLocation-ID:
p. 394-404
Oxford University Press
Sponsoring Org:
National Science Foundation
More Like this
  1. Discovering interaction effects on a response of interest is a fundamental problem faced in biology, medicine, economics, and many other scientific disciplines. In theory, Bayesian methods for discovering pairwise interactions enjoy many benefits such as coherent uncertainty quantification, the ability to incorporate background knowledge, and desirable shrinkage properties. In practice, however, Bayesian methods are often computationally intractable for even moderate- dimensional problems. Our key insight is that many hierarchical models of practical interest admit a Gaussian process representation such that rather than maintaining a posterior over all O(p^2) interactions, we need only maintain a vector of O(p) kernel hyper-parameters. Thismore »implicit representation allows us to run Markov chain Monte Carlo (MCMC) over model hyper-parameters in time and memory linear in p per iteration. We focus on sparsity-inducing models and show on datasets with a variety of covariate behaviors that our method: (1) reduces runtime by orders of magnitude over naive applications of MCMC, (2) provides lower Type I and Type II error relative to state-of-the-art LASSO-based approaches, and (3) offers improved computational scaling in high dimensions relative to existing Bayesian and LASSO-based approaches.« less
  2. Many Markov Chain Monte Carlo (MCMC) methods leverage gradient information of the potential function of target distribution to explore sample space efficiently. However, computing gradients can often be computationally expensive for large scale applications, such as those in contemporary machine learning. Stochastic Gradient (SG-)MCMC methods approximate gradients by stochastic ones, commonly via uniformly subsampled data points, and achieve improved computational efficiency, however at the price of introducing sampling error. We propose a non-uniform subsampling scheme to improve the sampling accuracy. The proposed exponentially weighted stochastic gradient (EWSG) is designed so that a non-uniform-SG-MCMC method mimics the statistical behavior ofmore »a batch-gradient-MCMC method, and hence the inaccuracy due to SG approximation is reduced. EWSG differs from classical variance reduction (VR) techniques as it focuses on the entire distribution instead of just the variance; nevertheless, its reduced local variance is also proved. EWSG can also be viewed as an extension of the importance sampling idea, successful for stochastic-gradient-based optimizations, to sampling tasks. In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index, which is coupled to the MCMC algorithm. Numerical experiments are provided, not only to demonstrate EWSG's effectiveness, but also to guide hyperparameter choices, and validate our non-asymptotic global error bound despite of approximations in the implementation. Notably, while statistical accuracy is improved, convergence speed can be comparable to the uniform version, which renders EWSG a practical alternative to VR (but EWSG and VR can be combined too).

    « less
  3. Dielectric elastomers are employed for a wide variety of adaptive structures. Many of these soft elastomers exhibit significant rate-dependencies in their response. Accurately quantifying this viscoelastic behavior is non-trivial and in many cases a nonlinear modeling framework is required. Fractional-order operators have been applied to modeling viscoelastic behavior for many years, and recent research has shown fractional-order methods to be effective for nonlinear frameworks. This implementation can become computationally expensive to achieve an accurate approximation of the fractional-order derivative. Accurate estimation of the elastomer’s viscoelastic behavior to quantify parameter uncertainty motivates the use of Markov Chain Monte Carlo (MCMC) methods.more »Since MCMC is a sampling based method, requiring many model evaluations, efficient estimation of the fractional derivative operator is crucial. In this paper, we demonstrate the effectiveness of using quadrature techniques to approximate the Riemann–Liouville definition for fractional derivatives in the context of estimating the uncertainty of a nonlinear viscoelastic model. We also demonstrate the use of parameter subset selection techniques to isolate parameters that are identifiable in the sense that they are uniquely determined by measured data. For those identifiable parameters, we employ Bayesian inference to compute posterior distributions for parameters. Finally, we propagate parameter uncertainties through the models to compute prediction intervals for quantities of interest.« less
  4. SUMMARY We introduce a new finite-element (FE) based computational framework to solve forward and inverse elastic deformation problems for earthquake faulting via the adjoint method. Based on two advanced computational libraries, FEniCS and hIPPYlib for the forward and inverse problems, respectively, this framework is flexible, transparent and easily extensible. We represent a fault discontinuity through a mixed FE elasticity formulation, which approximates the stress with higher order accuracy and exposes the prescribed slip explicitly in the variational form without using conventional split node and decomposition discrete approaches. This also allows the first order optimality condition, that is the vanishing ofmore »the gradient, to be expressed in continuous form, which leads to consistent discretizations of all field variables, including the slip. We show comparisons with the standard, pure displacement formulation and a model containing an in-plane mode II crack, whose slip is prescribed via the split node technique. We demonstrate the potential of this new computational framework by performing a linear coseismic slip inversion through adjoint-based optimization methods, without requiring computation of elastic Green’s functions. Specifically, we consider a penalized least squares formulation, which in a Bayesian setting—under the assumption of Gaussian noise and prior—reflects the negative log of the posterior distribution. The comparison of the inversion results with a standard, linear inverse theory approach based on Okada’s solutions shows analogous results. Preliminary uncertainties are estimated via eigenvalue analysis of the Hessian of the penalized least squares objective function. Our implementation is fully open-source and Jupyter notebooks to reproduce our results are provided. The extension to a fully Bayesian framework for detailed uncertainty quantification and non-linear inversions, including for heterogeneous media earthquake problems, will be analysed in a forthcoming paper.« less
  5. The Bayesian formulation of inverse problems is attractive for three primary reasons: it provides a clear modelling framework; it allows for principled learning of hyperparameters; and it can provide uncertainty quantification. The posterior distribution may in principle be sampled by means of MCMC or SMC methods, but for many problems it is computationally infeasible to do so. In this situation maximum a posteriori (MAP) estimators are often sought. Whilst these are relatively cheap to compute, and have an attractive variational formulation, a key drawback is their lack of invariance under change of parameterization; it is important to study MAP estimators,more »however, because they provide a link with classical optimization approaches to inverse problems and the Bayesian link may be used to improve upon classical optimization approaches. The lack of invariance of MAP estimators under change of parameterization is a particularly significant issue when hierarchical priors are employed to learn hyperparameters. In this paper we study the effect of the choice of parameterization on MAP estimators when a conditionally Gaussian hierarchical prior distribution is employed. Specifically we consider the centred parameterization, the natural parameterization in which the unknown state is solved for directly, and the noncentred parameterization, which works with a whitened Gaussian as the unknown state variable, and arises naturally when considering dimension-robust MCMC algorithms; MAP estimation is well-defined in the nonparametric setting only for the noncentred parameterization. However, we show that MAP estimates based on the noncentred parameterization are not consistent as estimators of hyperparameters; conversely, we show that limits of finite-dimensional centred MAP estimators are consistent as the dimension tends to infinity. We also consider empirical Bayesian hyperparameter estimation, show consistency of these estimates, and demonstrate that they are more robust with respect to noise than centred MAP estimates. An underpinning concept throughout is that hyperparameters may only be recovered up to measure equivalence, a well-known phenomenon in the context of the Ornstein–Uhlenbeck process. The applicability of the results is demonstrated concretely with the study of hierarchical Whittle–Matérn and ARD priors.« less