Title: Implications of Multivariate Non-Gaussian Data Assimilation for Multi-scale Weather Prediction
Abstract: Weather prediction models currently operate within a probabilistic framework for generating forecasts conditioned on recent measurements of Earth’s atmosphere. This framework can be conceptualized as one that approximates parts of a Bayesian posterior density estimated under assumptions of Gaussian errors. Gaussian error approximations are appropriate for synoptic-scale atmospheric flow, which experiences quasi-linear error evolution over time scales depicted by measurements, but are often hypothesized to be inappropriate for highly nonlinear, sparsely observed mesoscale processes. The current study adopts an experimental regional modeling system to examine the impact of Gaussian prior error approximations, which are adopted by ensemble Kalman filters (EnKFs) to generate probabilistic predictions. The analysis is aided by results obtained using recently introduced particle filter (PF) methodology that relies on an implicit non-parametric representation of prior probability densities, but at added computational expense. The investigation focuses on EnKF and PF comparisons over month-long experiments performed using an extensive domain, which features the development and passage of numerous extratropical and tropical cyclones. The experiments reveal spurious small-scale corrections in EnKF members, which come about from inappropriate Gaussian approximations for priors dominated by alignment uncertainty in mesoscale weather systems. Similar behavior is found in PF members, owing to the use of a localization operator, but to a much lesser extent. This result is reproduced and studied using a low-dimensional model, which permits the use of large-sample estimates of the Bayesian posterior distribution. Findings from this study motivate the use of data assimilation techniques that provide a more appropriate specification of multivariate non-Gaussian prior densities or a multi-scale treatment of alignment errors during data assimilation.
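
The contrast described above can be illustrated with a minimal sketch in Python (not the paper's regional modeling system): a stochastic EnKF update applies a Gaussian prior approximation to a bimodal, alignment-like prior, while a bootstrap particle filter reweights and resamples the same prior. The ensemble size, observation value, and error variance below are illustrative assumptions.

# Minimal sketch: EnKF vs. bootstrap PF update for a scalar state whose
# prior is bimodal, loosely mimicking alignment (position) uncertainty.
import numpy as np

rng = np.random.default_rng(0)
ne = 1000                                      # ensemble / particle count
prior = np.concatenate([rng.normal(-2.0, 0.5, ne // 2),
                        rng.normal(2.0, 0.5, ne // 2)])   # bimodal prior sample
y, r = 1.5, 0.5**2                             # observation and its error variance

# Stochastic EnKF (perturbed observations): Gaussian prior approximation
pb = prior.var(ddof=1)                         # prior sample variance
k = pb / (pb + r)                              # Kalman gain for H = 1
enkf = prior + k * (y + rng.normal(0.0, np.sqrt(r), ne) - prior)

# Bootstrap PF: importance weights from the likelihood, then resample
w = np.exp(-0.5 * (y - prior)**2 / r)
w /= w.sum()
pf = rng.choice(prior, size=ne, p=w)

print("EnKF posterior mean/var:", enkf.mean(), enkf.var(ddof=1))
print("PF   posterior mean/var:", pf.mean(), pf.var(ddof=1))

In this toy setting, the EnKF analysis, built on a single Gaussian fit to the bimodal prior, shifts all members toward the observation and yields a biased, over-dispersive posterior, whereas the PF concentrates particles in the prior mode supported by the observation.
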
Award ID(s):
1848363
NSF-PAR ID:
10403593
Author(s) / Creator(s):
Date Published:
Journal Name:
Monthly Weather Review
ISSN:
0027-0644
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Particle filters avoid parametric estimates for Bayesian posterior densities, which alleviates Gaussian assumptions in nonlinear regimes. These methods, however, are more sensitive to sampling errors than Gaussian-based techniques such as ensemble Kalman filters. A recent study by the authors introduced an iterative strategy for particle filters that match posterior moments, in which the iterations improve the filter’s ability to draw samples from non-Gaussian posterior densities. The iterations follow from a factorization of particle weights, providing a natural framework for combining particle filters with alternative filters to mitigate the impact of sampling errors. The current study introduces a novel approach to forming an adaptive hybrid data assimilation methodology, exploiting the theoretical strengths of nonparametric and parametric filters. At each data assimilation cycle, the iterative particle filter performs a sequence of updates while the prior sample distribution is non-Gaussian, and an ensemble Kalman filter provides the final adjustment once Gaussian distributions for marginal quantities are detected. The method employs the Shapiro–Wilk test, which has outstanding power for detecting departures from normality, to determine when to make the transition between filter algorithms. Experiments using low-dimensional models demonstrate that the approach has significant value, especially for nonhomogeneous observation networks and unknown model process errors. Moreover, hybrid factors are extended to consider marginals of more than one collocated variable using a test for multivariate normality. Findings from this study motivate the use of the proposed method for geophysical problems characterized by diverse observation networks and various dynamic instabilities, such as numerical weather prediction models.

    Significance Statement

    Data assimilation statistically processes observation errors and model forecast errors to provide optimal initial conditions for the forecast, playing a critical role in numerical weather forecasting. The ensemble Kalman filter, which has been widely adopted and developed at many operational centers, assumes Gaussianity of the prior distribution and solves a linear system of equations, leading to bias in strongly nonlinear regimes. On the other hand, particle filters avoid many of those assumptions but are sensitive to sampling errors and are computationally expensive. We propose an adaptive hybrid strategy that combines the advantages and minimizes the disadvantages of the two methods. The hybrid particle filter–ensemble Kalman filter is achieved with the Shapiro–Wilk test, which detects the Gaussianity of the ensemble members and determines the timing of the transition between these filter updates. Demonstrations in this study show that the proposed method is advantageous when observations are heterogeneous and when the model has an unknown bias. Furthermore, by extending the statistical hypothesis test to a test for multivariate normality, we consider marginals of more than one collocated variable. These results encourage further testing on real geophysical problems characterized by various dynamic instabilities, such as real numerical weather prediction models.
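
    As a rough illustration of the adaptive switch (a minimal sketch in Python under the assumption of a scalar marginal and a single observation, not the authors' full iterative scheme), the Shapiro–Wilk test available in scipy.stats can gate which update is applied; the significance level and update formulas below are illustrative choices.

# Hybrid update sketch: Shapiro-Wilk test decides between an EnKF-style
# update (Gaussian marginal) and an importance-resampling PF update.
import numpy as np
from scipy.stats import shapiro

def hybrid_update(prior, y, r, alpha=0.05, rng=None):
    """prior: 1D ensemble of one marginal variable; y, r: observation and
    observation error variance; alpha: illustrative significance level."""
    rng = np.random.default_rng(1) if rng is None else rng
    _, pval = shapiro(prior)
    if pval > alpha:                        # no evidence against normality -> EnKF
        pb = prior.var(ddof=1)
        k = pb / (pb + r)
        return prior + k * (y + rng.normal(0.0, np.sqrt(r), prior.size) - prior)
    w = np.exp(-0.5 * (y - prior)**2 / r)   # non-Gaussian marginal -> weight/resample
    w /= w.sum()
    return rng.choice(prior, size=prior.size, p=w)
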
  2. Abstract

    Iterative ensemble filters and smoothers are now commonly used for geophysical models. Some of these methods rely on a factorization of the observation likelihood function to sample from a posterior density through a set of "tempered" transitions to ensemble members. For Gaussian-based data assimilation methods, tangent linear versions of nonlinear operators can be relinearized between iterations, thus leading to a solution that is less biased than a single-step approach. This study adopts similar iterative strategies for a localized particle filter (PF) that relies on the estimation of moments to adjust unobserved variables based on importance weights. This approach builds on a "regularization" of the local PF, which forces weights to be more uniform through heuristic means. The regularization then leads to an adaptive tempering, which can also be combined with filter updates from parametric methods, such as ensemble Kalman filters. The role of iterations is analyzed by deriving the localized posterior probability density assumed by current local PF formulations and then examining how single-step and tempered PFs sample from this density. In experiments performed with a low-dimensional nonlinear system, the iterative and hybrid strategies show the largest benefits in observation-sparse regimes, where only a few particles contain high likelihoods and prior errors are non-Gaussian. This regime mimics specific applications in numerical weather prediction, where small ensemble sizes, unresolved model error, and highly nonlinear dynamics lead to prior uncertainty that is larger than measurement uncertainty.
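
    A minimal sketch of the likelihood tempering idea (in Python, for a scalar state with one Gaussian-error observation); the fixed tempering coefficients and the post-resampling jitter below are illustrative stand-ins for the adaptive regularization of the localized PF described above.

# Tempered PF sketch: the likelihood exponent is split across stages whose
# coefficients sum to one, so the staged reweighting recovers the full update.
import numpy as np

def tempered_pf(particles, y, r, betas=(0.3, 0.3, 0.4), jitter=0.05, rng=None):
    rng = np.random.default_rng(2) if rng is None else rng
    assert abs(sum(betas) - 1.0) < 1e-12
    for beta in betas:
        w = np.exp(-0.5 * beta * (y - particles)**2 / r)  # tempered likelihood
        w /= w.sum()
        idx = rng.choice(particles.size, size=particles.size, p=w)
        particles = particles[idx] + jitter * rng.standard_normal(particles.size)
    return particles
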

     
  3. Abstract. Mesoscale dynamics in the mesosphere and lower thermosphere (MLT) region have been difficult to study from either ground- or satellite-based observations. The spatial scales important for atmospheric coupling processes at these altitudes range from tens to hundreds of kilometers in the horizontal plane. To date, these scales have been challenging to observe, so such structures are usually parameterized in global circulation models. The advent of multistatic specular meteor radar networks allows exploration of MLT mesoscale dynamics on these scales, using the increased number of detections and the diversity of viewing angles inherent to multistatic networks. In this work, we introduce a four-dimensional wind field inversion method that makes use of Gaussian process regression (GPR), which is a nonparametric and Bayesian approach. The method takes measured projected wind velocities and prior distributions of the wind velocity as a function of space and time, specified by the user or estimated from the data, and produces posterior distributions for the wind velocity. The predictive posterior distribution is computed at sampled points of interest, which need not be regularly spaced. The main benefits of the GPR method include this non-gridded sampling, the built-in statistical uncertainty estimates, and the ability to horizontally resolve winds on relatively small scales. The performance of the GPR implementation has been evaluated on Monte Carlo simulations with known distributions, using the same spatial and temporal sampling as 1 d of real meteor measurements. Based on the simulation results, we find that the GPR implementation is robust, providing wind fields that are statistically unbiased, with statistical variances that depend on the geometry and are proportional to the prior velocity variances. A conservative and fast approach can be straightforwardly implemented by employing overestimated prior variances and distances, while a more robust but computationally intensive approach can be implemented by training and fitting the model hyperparameters. The latter GPR approach has been applied to a 24 h dataset and shown to compare well to previously used homogeneous and gradient methods. Small-scale features have reasonably low statistical uncertainties, implying geophysical wind field horizontal structures as small as 20–50 km. We suggest that this GPR approach forms a suitable method for MLT regional and weather studies.
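
    A minimal sketch in Python, assuming a single wind component observed directly with Gaussian noise at scattered space-time points; the method above instead inverts projected (line-of-sight) meteor velocities for the full wind vector. The kernel length scales, variances, and the scikit-learn GPR implementation are illustrative stand-ins.

# GPR sketch: fit an anisotropic space-time kernel to scattered synthetic
# wind observations, then predict mean and uncertainty at non-gridded points.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

rng = np.random.default_rng(3)
n = 200
X = np.column_stack([rng.uniform(0, 300, n),      # x (km)
                     rng.uniform(0, 300, n),      # y (km)
                     rng.uniform(0, 3600, n)])    # t (s)
u_obs = 20.0 * np.sin(X[:, 0] / 80.0) + rng.normal(0.0, 3.0, n)  # synthetic zonal wind

kernel = (ConstantKernel(100.0) * RBF(length_scale=[50.0, 50.0, 900.0])
          + WhiteKernel(noise_level=9.0))          # prior covariance + obs noise
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, u_obs)

X_new = np.array([[150.0, 150.0, 1800.0],          # arbitrary, non-gridded points
                  [40.0, 260.0, 600.0]])
u_mean, u_std = gpr.predict(X_new, return_std=True)
print(u_mean, u_std)
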
  4. Abstract The replacement of a nonlinear parameter-to-observable mapping with a linear (affine) approximation is often carried out to reduce the computational costs associated with solving large-scale inverse problems governed by partial differential equations (PDEs). In the case of a linear parameter-to-observable mapping with normally distributed additive noise and a Gaussian prior measure on the parameters, the posterior is Gaussian. However, substituting a (possibly well-justified) linear surrogate model for an accurate model can give misleading results if the induced model approximation error is not accounted for. To account for the errors, the Bayesian approximation error (BAE) approach can be utilised, in which the first- and second-order statistics of the errors are computed via sampling. The most common linear approximation is carried out via linear Taylor expansion, which requires the computation of (Fréchet) derivatives of the parameter-to-observable mapping with respect to the parameters of interest. In this paper, we prove that the (approximate) posterior measure obtained by replacing the nonlinear parameter-to-observable mapping with a linear approximation is in fact independent of the choice of the linear approximation when the BAE approach is employed. Thus, somewhat non-intuitively, employing the zero model as the linear approximation gives the same approximate posterior as any other choice of linear approximation of the parameter-to-observable model. The independence from the choice of linear approximation is demonstrated mathematically and illustrated with two numerical PDE-based problems: an inverse scattering type problem and an inverse conductivity type problem.
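
    A minimal sketch in Python of how the BAE statistics are computed by sampling, using a toy nonlinear map and the zero model as the linear surrogate; the map, prior, and sample size are illustrative assumptions, not the PDE-based problems above. Per the independence result, any other linear surrogate would yield the same approximate posterior once these statistics are folded into the noise model.

# BAE sketch: sample the prior, evaluate approximation errors of a linear
# surrogate, and form their first- and second-order statistics.
import numpy as np

rng = np.random.default_rng(4)

def forward(x):                        # toy nonlinear parameter-to-observable map
    return x + 0.3 * x**2

n_samp, dim = 5000, 3
prior_samples = rng.standard_normal((n_samp, dim))   # Gaussian prior N(0, I)
A = np.zeros((dim, dim))                              # zero-model linear surrogate

eps = np.array([forward(x) - A @ x for x in prior_samples])
eps_mean = eps.mean(axis=0)                           # first-order BAE statistic
eps_cov = np.cov(eps, rowvar=False)                   # second-order BAE statistic

# The BAE likelihood then augments the noise covariance with eps_cov and
# shifts the data by eps_mean in place of the exact nonlinear forward model.
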
  5. Abstract

    For data assimilation to provide faithful state estimates for dynamical models, specifications of observation uncertainty need to be as accurate as possible. Innovation-based methods built on the Desroziers diagnostics are commonly used to estimate observation uncertainty, but such methods can depend greatly on the prescribed background uncertainty. For ensemble data assimilation, this uncertainty comes from statistics calculated from ensemble forecasts, which require inflation and localization to address undersampling. In this work, we use an ensemble Kalman filter (EnKF) with a low-dimensional Lorenz model to investigate the interplay between the Desroziers method and inflation. Two inflation techniques are used for this purpose: 1) a rigorously tuned fixed multiplicative scheme and 2) an adaptive state-space scheme. We document how inaccuracies in observation uncertainty affect errors in EnKF posteriors and study the combined impacts of misspecified initial observation uncertainty, sampling error, and model error on Desroziers estimates. We find that whether observation uncertainty is over- or underestimated greatly affects the stability of data assimilation and the accuracy of Desroziers estimates, and that preference should be given to initial overestimates. Inline Desroziers estimates tend to remove the dependence of ensemble spread–skill on the initially prescribed observation error. In addition, we find that the inclusion of model error introduces spurious correlations in observation uncertainty estimates. Further, we note that the adaptive inflation scheme is less robust than fixed inflation at mitigating multiple sources of error. Last, sampling error strongly exacerbates existing sources of error and greatly degrades EnKF estimates, which translates into biased Desroziers estimates of observation error covariance.

    Significance Statement

    To generate accurate predictions of various components of the Earth system, numerical models require an accurate specification of state variables at the current time. This step probabilistically weighs our current state estimate against information provided by environmental measurements of the true state. Various strategies exist for estimating observation uncertainty within this framework, but they are sensitive to a host of assumptions, which are investigated in this study.
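
    For reference, a minimal sketch in Python of the Desroziers estimate of observation error covariance used in the abstract above, R_est = E[(y - H x_a)(y - H x_b)^T]; the synthetic residuals below are illustrative stand-ins for innovations collected from an EnKF over many assimilation cycles.

# Desroziers diagnostic sketch: estimate R from paired background innovations
# and analysis residuals accumulated over many cycles.
import numpy as np

def desroziers_R(d_ob, d_oa):
    """d_ob: (ncycles, nobs) background innovations y - H(x_b);
    d_oa: (ncycles, nobs) analysis residuals y - H(x_a)."""
    return d_oa.T @ d_ob / d_ob.shape[0]

rng = np.random.default_rng(5)
d_ob = rng.normal(0.0, 1.2, (500, 4))              # assumed synthetic innovations
d_oa = 0.4 * d_ob + rng.normal(0.0, 0.2, (500, 4))
print(np.diag(desroziers_R(d_ob, d_oa)))           # diagonal of the R estimate
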

     