

Title: Probabilistic model-error assessment of deep learning proxies: an application to real-time inversion of borehole electromagnetic measurements
SUMMARY

The advent of fast sensing technologies allows for real-time model updates in many applications where the model parameters are uncertain. Once observations are collected, Bayesian algorithms offer a pathway for real-time inversion (i.e., updating of model parameters/inputs) because the Bayesian framework handles non-uniqueness and uncertainty flexibly. However, Bayesian algorithms rely on repeated evaluation of the computational model, and deep learning (DL) based proxies can be useful for addressing this computational bottleneck. In this paper, we study the effects of the approximate nature of deep-learned models and the associated model errors during the inversion of borehole electromagnetic (EM) measurements, which are usually obtained from logging while drilling. We rely on iterative ensemble smoothers as an effective algorithm for real-time inversion due to their parallel nature and relatively low computational cost. The real-time inversion of EM measurements is used to determine the subsurface geology and properties, which are critical for real-time adjustments of the well trajectory (geosteering). The use of a deep neural network (DNN) as a forward model allows us to perform thousands of model evaluations within seconds, which is very useful for quantifying uncertainties and non-uniqueness in real time. While significant efforts are usually made to ensure the accuracy of DL models, it is widely known that DNNs can contain model errors in the regions not covered by the training data; these errors are unknown and training-specific. When DL models are utilized during inversion of EM measurements, the model errors can manifest as bias in the estimated input parameters and, as a consequence, may result in low-quality geosteering decisions. We present numerical results highlighting the challenges associated with inverting EM measurements while neglecting model error.
We further demonstrate the utility of a recently proposed flexible iterative ensemble smoother in reducing the effect of model bias by capturing the unknown model errors, thus improving the quality of the estimated subsurface properties for geosteering operations. Moreover, we describe a procedure for identifying inversion multimodality and propose possible solutions to alleviate it in real time.
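The core analysis step of such an ensemble smoother can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `proxy_forward` is a hypothetical `tanh` stand-in for the DNN proxy of the EM forward model, and the `model_err_std` term shows one simple way to account for proxy error by inflating the data-error covariance (the flexible smoother in the paper treats model error more generally).

```python
import numpy as np

rng = np.random.default_rng(0)

def proxy_forward(m):
    # hypothetical stand-in for the DNN proxy of the EM forward model
    return np.tanh(m)

def es_update(M, d_obs, obs_std, model_err_std, forward):
    """One perturbed-observation ensemble-smoother analysis step.

    M      : (n_param, n_ens) prior parameter ensemble
    d_obs  : (n_data,) observed data
    An assumed model-error variance is added to the observation-error
    covariance so the update does not over-trust the proxy.
    """
    D = forward(M)                                  # predicted data, (n_data, n_ens)
    n_ens = M.shape[1]
    Ma = M - M.mean(axis=1, keepdims=True)
    Da = D - D.mean(axis=1, keepdims=True)
    C_md = Ma @ Da.T / (n_ens - 1)                  # parameter-data cross-covariance
    C_dd = Da @ Da.T / (n_ens - 1)                  # predicted-data covariance
    C_e = (obs_std**2 + model_err_std**2) * np.eye(len(d_obs))
    # perturb observations so the updated ensemble retains posterior spread
    D_pert = d_obs[:, None] + obs_std * rng.standard_normal((len(d_obs), n_ens))
    return M + C_md @ np.linalg.solve(C_dd + C_e, D_pert - D)

# toy 1-parameter, 1-datum example
truth = np.array([0.8])
d_obs = proxy_forward(truth) + 0.01                 # synthetic noisy measurement
M_prior = rng.normal(0.0, 1.0, size=(1, 200))
M_post = es_update(M_prior, d_obs, obs_std=0.05,
                   model_err_std=0.02, forward=proxy_forward)
```

Because every ensemble member is updated independently given the Kalman-type gain, the forward evaluations parallelize trivially, which is what makes the scheme attractive for real-time use.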

 
NSF-PAR ID: 10367306
Author(s) / Creator(s): ; ;
Publisher / Repository: Oxford University Press
Date Published:
Journal Name: Geophysical Journal International
Volume: 230
Issue: 3
ISSN: 0956-540X
Page Range / eLocation ID: p. 1800-1817
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract Particle filters avoid parametric estimates for Bayesian posterior densities, which alleviates Gaussian assumptions in nonlinear regimes. These methods, however, are more sensitive to sampling errors than Gaussian-based techniques such as ensemble Kalman filters. A recent study by the authors introduced an iterative strategy for particle filters that matches posterior moments, where iterations improve the filter's ability to draw samples from non-Gaussian posterior densities. The iterations follow from a factorization of particle weights, providing a natural framework for combining particle filters with alternative filters to mitigate the impact of sampling errors. The current study introduces a novel approach to forming an adaptive hybrid data assimilation methodology, exploiting the theoretical strengths of nonparametric and parametric filters. At each data assimilation cycle, the iterative particle filter performs a sequence of updates while the prior sample distribution is non-Gaussian; an ensemble Kalman filter then provides the final adjustment when Gaussian distributions for marginal quantities are detected. The method employs the Shapiro–Wilk test, which has outstanding power for detecting departures from normality, to determine when to make the transition between filter algorithms. Experiments using low-dimensional models demonstrate that the approach has significant value, especially for nonhomogeneous observation networks and unknown model process errors. Moreover, hybrid factors are extended to consider marginals of more than one collocated variable using a test for multivariate normality. Findings from this study motivate the use of the proposed method for geophysical problems characterized by diverse observation networks and various dynamic instabilities, such as numerical weather prediction models.
Significance Statement Data assimilation statistically processes observation errors and model forecast errors to provide optimal initial conditions for the forecast, playing a critical role in numerical weather forecasting. The ensemble Kalman filter, which has been widely adopted and developed in many operational centers, assumes Gaussianity of the prior distribution and solves a linear system of equations, leading to bias in strongly nonlinear regimes. On the other hand, particle filters avoid many of those assumptions but are sensitive to sampling errors and are computationally expensive. We propose an adaptive hybrid strategy that combines the advantages and minimizes the disadvantages of the two methods. The hybrid particle filter–ensemble Kalman filter is achieved with the Shapiro–Wilk test to detect the Gaussianity of the ensemble members and determine the timing of the transition between these filter updates. Demonstrations in this study show that the proposed method is advantageous when observations are heterogeneous and when the model has an unknown bias. Furthermore, by extending the statistical hypothesis test to a test for multivariate normality, we consider marginals of more than one collocated variable. These results encourage further testing for real geophysical problems characterized by various dynamic instabilities, such as real numerical weather prediction models.
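The Shapiro–Wilk gating idea above can be sketched as follows. The helper name `choose_update` is hypothetical; the test itself is `scipy.stats.shapiro`, applied here to a 1-D marginal ensemble, and the 0.05 significance level is an illustrative choice.

```python
import numpy as np
from scipy.stats import shapiro

def choose_update(members, alpha=0.05):
    """Gate between filter updates for a 1-D marginal ensemble:
    keep iterating the particle filter while the Shapiro-Wilk test
    rejects normality; hand off to the EnKF once it no longer does."""
    _, p_value = shapiro(members)
    return "enkf" if p_value >= alpha else "particle_filter"

rng = np.random.default_rng(1)
skewed = rng.exponential(scale=1.0, size=100)    # clearly non-Gaussian marginal
near_gaussian = rng.normal(0.0, 1.0, size=100)   # plausibly Gaussian marginal
```

In the multivariate extension described above, the scalar test would be replaced by a multivariate normality test over several collocated variables, with the same accept/reject gating logic.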
  2. Abstract

    Parameters in climate models are usually calibrated manually, exploiting only small subsets of the available data. This precludes both optimal calibration and quantification of uncertainties. Traditional Bayesian calibration methods that allow uncertainty quantification are too expensive for climate models; they are also not robust in the presence of internal climate variability. For example, Markov chain Monte Carlo (MCMC) methods typically require a very large number of model runs and are sensitive to internal variability noise, rendering them infeasible for climate models. Here we demonstrate an approach to model calibration and uncertainty quantification that requires only a modest number of model runs and can accommodate internal climate variability. The approach consists of three stages: (a) a calibration stage uses variants of ensemble Kalman inversion to calibrate a model by minimizing mismatches between model and data statistics; (b) an emulation stage emulates the parameter‐to‐data map with Gaussian processes (GPs), using the model runs in the calibration stage for training; (c) a sampling stage approximates the Bayesian posterior distributions by sampling the GP emulator with MCMC. We demonstrate the feasibility and computational efficiency of this calibrate‐emulate‐sample (CES) approach in a perfect‐model setting. Using an idealized general circulation model, we estimate parameters in a simple convection scheme from synthetic data generated with the model. The CES approach generates probability distributions of the parameters that are good approximations of the Bayesian posteriors, at a fraction of the computational cost usually required to obtain them. Sampling from this approximate posterior allows the generation of climate predictions with quantified parametric uncertainties.
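Stage (a), ensemble Kalman inversion (EKI), can be sketched on a toy problem. Everything below is illustrative under stated assumptions: the linear map `A` stands in for the expensive parameter-to-statistics map of the climate model, and the iteration count and noise level are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def eki_step(theta, d_obs, noise_cov, forward):
    """One ensemble Kalman inversion iteration: nudge the parameter
    ensemble toward the data, as in the CES calibration stage.

    theta : (n_param, n_ens) parameter ensemble
    d_obs : (n_data,) target data statistics
    """
    G = forward(theta)
    th_a = theta - theta.mean(axis=1, keepdims=True)
    g_a = G - G.mean(axis=1, keepdims=True)
    n_ens = theta.shape[1]
    C_tg = th_a @ g_a.T / (n_ens - 1)   # parameter-output cross-covariance
    C_gg = g_a @ g_a.T / (n_ens - 1)    # output covariance
    return theta + C_tg @ np.linalg.solve(C_gg + noise_cov, d_obs[:, None] - G)

# toy linear forward model standing in for the model-statistics map
A = np.array([[1.0, 0.5],
              [0.0, 2.0]])
forward = lambda th: A @ th
truth = np.array([1.0, -0.5])
d_obs = forward(truth)

theta = rng.normal(0.0, 1.0, size=(2, 50))
for _ in range(5):
    theta = eki_step(theta, d_obs, noise_cov=1e-4 * np.eye(2), forward=forward)
```

The forward evaluations collected during these iterations are exactly the training pairs the emulation stage (b) would reuse to fit the GP, which is where the method's run-count savings come from.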

     
  3. Deep Learning (DL) methods have been transforming computer vision with innovative adaptations to other domains, including climate change. For DL to pervade Science and Engineering (S&E) applications where risk management is a core component, well-characterized uncertainty estimates must accompany predictions. However, S&E observations and model simulations often follow heavily skewed distributions and are not well modeled with standard DL approaches, which usually optimize a Gaussian, or Euclidean, likelihood loss. Recent developments in Bayesian Deep Learning (BDL), which attempts to capture uncertainties both from noisy observations (aleatoric) and from unknown model parameters (epistemic), provide us a foundation. Here we present a discrete-continuous BDL model with Gaussian and lognormal likelihoods for uncertainty quantification (UQ). We demonstrate the approach by developing UQ estimates on "DeepSD", a super-resolution-based DL model for Statistical Downscaling (SD) in climate, applied to precipitation, which follows an extremely skewed distribution. We find that the discrete-continuous models outperform a basic Gaussian distribution in terms of predictive accuracy and uncertainty calibration. Furthermore, we find that the lognormal distribution, which can handle skewed distributions, produces quality uncertainty estimates at the extremes. Such results may be important across S&E, as well as other domains such as finance and economics, where extremes are often of significant interest. Furthermore, to our knowledge, this is the first UQ model in SD in which both aleatoric and epistemic uncertainties are characterized.
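The discrete-continuous lognormal likelihood can be written down directly. This is a minimal sketch, assuming a point mass at zero (dry pixels) mixed with a lognormal density for positive precipitation; in a BDL model, a network head would predict the three parameters per pixel, and the function name is hypothetical.

```python
import numpy as np

def dc_lognormal_nll(y, p_zero, mu, log_sigma):
    """Discrete-continuous negative log-likelihood for a skewed target:
    a point mass at y = 0 with probability p_zero, and a
    LogNormal(mu, sigma) density for y > 0.

    Predicting log_sigma (rather than sigma) keeps the scale positive
    without constraints, a common aleatoric-uncertainty parameterization.
    """
    if y == 0.0:
        return -np.log(p_zero)
    sigma = np.exp(log_sigma)
    return (-np.log(1.0 - p_zero)                       # continuous branch prob.
            + np.log(y)                                  # lognormal Jacobian term
            + 0.5 * np.log(2.0 * np.pi) + log_sigma      # normalization
            + (np.log(y) - mu) ** 2 / (2.0 * sigma**2))  # squared error in log space
```

Averaging this loss over a batch and minimizing it with respect to the network outputs recovers maximum-likelihood training; the epistemic part would come from the BDL treatment of the weights, which is outside this sketch.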
  4. Abstract

    Identification of a heterogeneous conductivity field and reconstruction of a contaminant release history are key aspects of subsurface remediation. These two goals are achieved by combining model predictions with sparse and noisy hydraulic head and concentration measurements. Solution of this inverse problem is notoriously difficult due to, in part, the high dimensionality of the parameter space and the high computational cost of repeated forward solves. We use a convolutional adversarial autoencoder (CAAE) to parameterize a heterogeneous non‐Gaussian conductivity field via a low‐dimensional latent representation. A three‐dimensional dense convolutional encoder‐decoder (DenseED) network serves as a forward surrogate of the flow and transport model. The CAAE‐DenseED surrogate is fed into the ensemble smoother with multiple data assimilation (ESMDA) algorithm to sample from the Bayesian posterior distribution of the unknown parameters, forming a CAAE‐DenseED‐ESMDA inversion framework. The resulting CAAE‐DenseED‐ESMDA inversion strategy is used to identify a three‐dimensional contaminant source and conductivity field. A comparison of the inversion results from CAAE‐ESMDA with the physical flow and transport simulator and from CAAE‐DenseED‐ESMDA shows that the latter yields accurate reconstruction results at a fraction of the computational cost of the former.
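The ESMDA loop at the heart of this framework can be sketched as below. This is an illustrative toy, not the CAAE‐DenseED pipeline: `decode` is a hypothetical linear stand-in for decoder-plus-surrogate, and the key ESMDA ingredient shown is the inflation schedule, whose reciprocals must sum to one.

```python
import numpy as np

rng = np.random.default_rng(0)

def esmda(z, d_obs, obs_std, forward, alphas):
    """Ensemble smoother with multiple data assimilation: repeat a
    damped ES update with inflated observation noise. The inflation
    factors must satisfy sum(1 / alpha) == 1.

    z : (n_latent, n_ens) latent-parameter ensemble
    """
    n_data, n_ens = len(d_obs), z.shape[1]
    for alpha in alphas:
        D = forward(z)
        za = z - z.mean(axis=1, keepdims=True)
        da = D - D.mean(axis=1, keepdims=True)
        C_zd = za @ da.T / (n_ens - 1)
        C_dd = da @ da.T / (n_ens - 1)
        C_e = alpha * obs_std**2 * np.eye(n_data)
        # observation perturbations are inflated consistently with C_e
        D_pert = d_obs[:, None] + np.sqrt(alpha) * obs_std \
            * rng.standard_normal((n_data, n_ens))
        z = z + C_zd @ np.linalg.solve(C_dd + C_e, D_pert - D)
    return z

# hypothetical linear "decoder + flow/transport surrogate" from a 2-D latent code
decode = lambda z: np.vstack([z[0] + z[1], z[0] - z[1], 2.0 * z[0]])
truth_z = np.array([0.5, -0.2])
d_obs = decode(truth_z[:, None])[:, 0]

z_prior = rng.normal(0.0, 1.0, size=(2, 100))
z_post = esmda(z_prior, d_obs, obs_std=0.01, forward=decode, alphas=[4.0] * 4)
```

Working in the low-dimensional latent space is what makes the ensemble update tractable; the decoded field, not the latent code, is what the surrogate propagates to data space.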

     
  5. Abstract

    Raindrop size distributions (DSD) and rain rate have been estimated from polarimetric radar data using different approaches with the accuracy depending on the errors both in the radar measurements and the estimation methods. Herein, a deep neural network (DNN) technique was utilized to improve the estimation of the DSD and rain rate by mitigating these errors. The performance of this approach was evaluated using measurements from a two-dimensional video disdrometer (2DVD) at the Kessler Atmospheric and Ecological Field Station in Oklahoma as ground truth with the results compared against conventional estimation methods for the period 2006–17. Physical parameters (mass-/volume-weighted diameter and liquid water content), rain rate, and polarimetric radar variables (including radar reflectivity and differential reflectivity) were obtained from the DSD data. Three methods—physics-based inversion, empirical formula, and DNN—were applied to two different temporal domains (instantaneous and rain-event average) with three diverse error assumptions (fitting, measurement, and model errors). The DSD retrievals and rain estimates from 18 cases were evaluated by calculating the bias and root-mean-squared error (RMSE). DNN produced the best performance for most cases, with up to a 5% reduction in RMSE when model errors existed. DSD and rain estimated from a nearby polarimetric radar using the empirical and DNN methods were well correlated with the disdrometer observations; the rain-rate estimate bias of the DNN was significantly reduced (3.3% in DNN vs 50.1% in empirical). These results suggest that DNN has advantages over the physics-based and empirical methods in retrieving rain microphysics from radar observations.
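The bias and RMSE scores used to rank the three retrieval methods are straightforward to compute, and the empirical baseline is typically a power law in the radar variables. Both functions below are illustrative: `bias_and_rmse` matches the scores described above, while `rain_rate_empirical` uses a hypothetical R(Zh, ZDR) power-law form with made-up coefficients, not the paper's fitted values.

```python
import numpy as np

def bias_and_rmse(estimate, truth):
    """Relative bias (%) and RMSE, the two verification scores used to
    compare the physics-based, empirical, and DNN rain retrievals."""
    err = estimate - truth
    bias_pct = 100.0 * err.mean() / truth.mean()
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return bias_pct, rmse

def rain_rate_empirical(zh_linear, zdr_db, a=0.0067, b=0.93, c=-0.34):
    """Hypothetical empirical estimator R = a * Zh^b * Zdr^c, with Zh in
    linear units (mm^6 m^-3) and ZDR converted from dB to linear."""
    zdr_linear = 10.0 ** (zdr_db / 10.0)
    return a * zh_linear ** b * zdr_linear ** c
```

A DNN retrieval would replace the fixed power law with a learned mapping from the polarimetric variables to DSD parameters and rain rate, which is how it absorbs the fitting, measurement, and model errors discussed above.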

     