Abstract Particle filters avoid parametric estimates for Bayesian posterior densities, which alleviates Gaussian assumptions in nonlinear regimes. These methods, however, are more sensitive to sampling errors than Gaussian-based techniques such as ensemble Kalman filters. A recent study by the authors introduced an iterative strategy for particle filters that match posterior moments, in which iterations improve the filter’s ability to draw samples from non-Gaussian posterior densities. The iterations follow from a factorization of particle weights, providing a natural framework for combining particle filters with alternative filters to mitigate the impact of sampling errors. The current study introduces a novel approach to forming an adaptive hybrid data assimilation methodology, exploiting the theoretical strengths of nonparametric and parametric filters. At each data assimilation cycle, the iterative particle filter performs a sequence of updates while the prior sample distribution is non-Gaussian, then an ensemble Kalman filter provides the final adjustment when Gaussian distributions for marginal quantities are detected. The method uses the Shapiro–Wilk test, which has outstanding power for detecting departures from normality, to determine when to make the transition between filter algorithms. Experiments using low-dimensional models demonstrate that the approach has significant value, especially for nonhomogeneous observation networks and unknown model process errors. Moreover, hybrid factors are extended to consider marginals of more than one collocated variable using a test for multivariate normality. Findings from this study motivate the use of the proposed method for geophysical problems characterized by diverse observation networks and various dynamic instabilities, such as numerical weather prediction models.
Significance Statement Data assimilation statistically processes observation errors and model forecast errors to provide optimal initial conditions for the forecast, playing a critical role in numerical weather forecasting. The ensemble Kalman filter, which has been widely adopted and developed in many operational centers, assumes Gaussianity of the prior distribution and solves a linear system of equations, leading to bias in strongly nonlinear regimes. On the other hand, particle filters avoid many of those assumptions but are sensitive to sampling errors and are computationally expensive. We propose an adaptive hybrid strategy that combines the advantages of the two methods and minimizes their disadvantages. The hybrid particle filter–ensemble Kalman filter is achieved with the Shapiro–Wilk test to detect the Gaussianity of the ensemble members and determine the timing of the transition between these filter updates. Demonstrations in this study show that the proposed method is advantageous when observations are heterogeneous and when the model has an unknown bias. Furthermore, by extending the statistical hypothesis test to the test for multivariate normality, we consider marginals of more than one collocated variable. These results encourage further testing for real geophysical problems characterized by various dynamic instabilities, such as real numerical weather prediction models.
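The switching logic described above can be sketched for a single scalar variable as follows. This is a minimal illustration and not the authors' implementation: the function names (`is_gaussian`, `tempered_pf_step`, `enkf_step`, `hybrid_update`), the equal split of the likelihood into `n_steps` factors, the identity observation operator, the bootstrap resampling with jitter (used here in place of the moment-matching particle filter), and the perturbed-observation ensemble Kalman filter are all assumptions made for the example. Only the use of the Shapiro–Wilk test as the switching criterion comes from the abstract.

```python
import numpy as np
from scipy.stats import shapiro


def is_gaussian(sample, alpha=0.05):
    """True when the Shapiro-Wilk test does not reject normality at level alpha."""
    _, p_value = shapiro(sample)
    return p_value > alpha


def tempered_pf_step(ens, y, r, beta, rng):
    """One particle filter update using the likelihood raised to the power beta.

    Assumption: plain bootstrap resampling plus a small jitter, standing in
    for the moment-matching update described in the abstract.
    """
    logw = -0.5 * beta * (y - ens) ** 2 / r
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(ens.size, size=ens.size, p=w)
    jitter = 0.1 * ens.std(ddof=1) * rng.standard_normal(ens.size)
    return ens[idx] + jitter


def enkf_step(ens, y, r_eff, rng):
    """Perturbed-observation ensemble Kalman filter update with error variance r_eff."""
    var = ens.var(ddof=1)
    gain = var / (var + r_eff)
    y_pert = y + np.sqrt(r_eff) * rng.standard_normal(ens.size)
    return ens + gain * (y_pert - ens)


def hybrid_update(prior, y, r, n_steps=4, alpha=0.05, rng=None):
    """Iterate particle filter steps while the ensemble looks non-Gaussian,
    then let the ensemble Kalman filter assimilate the remaining likelihood factor.

    Returns the updated ensemble and the fraction of the likelihood
    handled by the particle filter.
    """
    rng = np.random.default_rng() if rng is None else rng
    ens = np.asarray(prior, dtype=float).copy()
    steps_done = 0
    for _ in range(n_steps):
        if is_gaussian(ens, alpha):
            break
        ens = tempered_pf_step(ens, y, r, 1.0 / n_steps, rng)
        steps_done += 1
    remaining = 1.0 - steps_done / n_steps
    if remaining > 0.0:
        # A Gaussian likelihood raised to the power `remaining` is a Gaussian
        # likelihood with its error variance inflated to r / remaining.
        ens = enkf_step(ens, y, r / remaining, rng)
    return ens, steps_done / n_steps


# Illustrative bimodal prior (made-up numbers) observed near one of its modes.
rng = np.random.default_rng(0)
prior = np.concatenate([rng.normal(-2.0, 0.5, 100), rng.normal(2.0, 0.5, 100)])
posterior, pf_fraction = hybrid_update(prior, y=2.0, r=0.25, rng=rng)
```

Splitting the likelihood into factors whose exponents sum to one is what keeps the combined update consistent with a single Bayesian update; the test only decides which filter handles each factor.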
This content will become publicly available on September 26, 2026
A Nonlinear Data Assimilation Algorithm with Closed-Form Approximations for Multi-Layer Flow Fields
Abstract State estimation in multi-layer turbulent flow fields with only a single layer of partial observation remains a challenging yet practically important task. Applications include inferring the state of the deep ocean by exploiting surface observations. Directly implementing an ensemble Kalman filter based on the full forecast model is usually expensive. One widely used method in practice projects the information of the observed layer to other layers via linear regression. However, large errors appear when nonlinearity in the highly turbulent flow field becomes dominant. In this paper, we develop a multi-step nonlinear data assimilation method that involves the sequential application of nonlinear assimilation steps across layers. Unlike traditional linear regression approaches, a conditional Gaussian nonlinear system is adopted as the approximate forecast model to characterize the nonlinear dependence between adjacent layers. At each step, samples drawn from the posterior of the current layer are treated as pseudo-observations for the next layer. Each sample is assimilated using analytic formulae for the posterior mean and covariance. The resulting Gaussian posteriors are then aggregated into a Gaussian mixture. Therefore, the method can capture strongly turbulent features, particularly intermittency and extreme events, and more accurately quantify the inherent uncertainty. Applications to the two-layer quasi-geostrophic system with Lagrangian data assimilation demonstrate that the multi-step method outperforms the one-step method, particularly as the tracer number and ensemble size increase. Results also show that the multi-step CGDA is particularly effective for assimilating frequent, high-accuracy observations, which are scenarios where traditional EnKF methods may suffer from catastrophic filter divergence.
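The pseudo-observation and aggregation steps can be illustrated with a scalar sketch. It is not the paper's conditional Gaussian formulae: a static linear-Gaussian update stands in for the closed-form conditional statistics, and the link coefficient `h`, noise variance `r`, and all numerical values are hypothetical. The mixture moments use the standard identity that the variance of an equally weighted mixture is the mean component variance plus the spread of the component means.

```python
import numpy as np


def gaussian_update(m, c, y, h, r):
    """Closed-form posterior mean and variance of a scalar Gaussian state
    with prior N(m, c) after observing y = h * x + noise, noise ~ N(0, r)."""
    k = c * h / (h * h * c + r)
    return m + k * (y - h * m), (1.0 - k * h) * c


def mixture_moments(means, variances):
    """Mean and variance of an equally weighted Gaussian mixture."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    mean = means.mean()
    var = variances.mean() + ((means - mean) ** 2).mean()
    return mean, var


rng = np.random.default_rng(0)

# Hypothetical posterior of the observed layer and prior of the next layer.
layer1_mean, layer1_var = 1.0, 0.2
layer2_prior_mean, layer2_prior_var = 0.0, 1.0
h, r = 0.8, 0.1  # assumed linear link between layers and its noise variance

# Samples from the current layer's posterior act as pseudo-observations.
pseudo_obs = layer1_mean + np.sqrt(layer1_var) * rng.standard_normal(50)

# Each pseudo-observation yields one Gaussian posterior for the next layer.
updates = [
    gaussian_update(layer2_prior_mean, layer2_prior_var, y, h, r)
    for y in pseudo_obs
]
comp_means, comp_vars = map(np.array, zip(*updates))

# Aggregate the Gaussian posteriors into a mixture and summarize it.
mix_mean, mix_var = mixture_moments(comp_means, comp_vars)
```

In this linear stand-in every component has the same variance, so the mixture differs from a single Gaussian only through the spread of the component means; with a nonlinear conditional Gaussian model the component covariances would also vary from sample to sample.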
- Award ID(s):
- 2232872
- PAR ID:
- 10638752
- Publisher / Repository:
- American Meteorological Society
- Date Published:
- Journal Name:
- Monthly Weather Review
- ISSN:
- 0027-0644
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Hoteit, Ibrahim (Ed.) A hybrid particle ensemble Kalman filter is developed for problems with medium non-Gaussianity, i.e. problems where the prior is very non-Gaussian but the posterior is approximately Gaussian. Such situations arise, e.g., when nonlinear dynamics produce a non-Gaussian forecast but a tight Gaussian likelihood leads to a nearly-Gaussian posterior. The hybrid filter starts by factoring the likelihood. First the particle filter assimilates the observations with one factor of the likelihood to produce an intermediate prior that is close to Gaussian, and then the ensemble Kalman filter completes the assimilation with the remaining factor. How the likelihood gets split between the two stages is determined in such a way as to ensure that the particle filter avoids collapse, and particle degeneracy is broken by a mean-preserving random orthogonal transformation. The hybrid is tested in a simple two-dimensional (2D) problem and a multiscale system of ODEs motivated by the Lorenz-’96 model. In the 2D problem it outperforms both a pure particle filter and a pure ensemble Kalman filter, and in the multiscale Lorenz-’96 model it is shown to outperform a pure ensemble Kalman filter, provided that the ensemble size is large enough.
-
Abstract Iterative ensemble filters and smoothers are now commonly used for geophysical models. Some of these methods rely on a factorization of the observation likelihood function to sample from a posterior density through a set of “tempered” transitions to ensemble members. For Gaussian‐based data assimilation methods, tangent linear versions of nonlinear operators can be relinearized between iterations, thus leading to a solution that is less biased than a single‐step approach. This study adopts similar iterative strategies for a localized particle filter (PF) that relies on the estimation of moments to adjust unobserved variables based on importance weights. This approach builds off a “regularization” of the local PF, which forces weights to be more uniform through heuristic means. The regularization then leads to an adaptive tempering, which can also be combined with filter updates from parametric methods, such as ensemble Kalman filters. The role of iterations is analyzed by deriving the localized posterior probability density assumed by current local PF formulations and then examining how single‐step and tempered PFs sample from this density. From experiments performed with a low‐dimensional nonlinear system, the iterative and hybrid strategies show the largest benefits in observation‐sparse regimes, where only a few particles contain high likelihoods and prior errors are non‐Gaussian. This regime mimics specific applications in numerical weather prediction, where small ensemble sizes, unresolved model error, and highly nonlinear dynamics lead to prior uncertainty that is larger than measurement uncertainty.
-
Developing suitable approximate models for analyzing and simulating complex nonlinear systems is practically important. This paper explores the skill of a rich class of nonlinear stochastic models, known as the conditional Gaussian nonlinear system (CGNS), as both a cheap surrogate model and a fast preconditioner for facilitating many computationally challenging tasks. The CGNS preserves the underlying physics to a large extent and can reproduce intermittency, extreme events, and other non-Gaussian features in many complex systems arising from practical applications. Three interrelated topics are studied. First, the closed analytic formulas for solving the conditional statistics provide an efficient and accurate data assimilation scheme. It is shown that the data assimilation skill of a suitable CGNS approximate forecast model exceeds that of an ensemble method applied even to the perfect model with strong nonlinearity, where the latter suffers from filter divergence. Second, the CGNS allows the development of a fast algorithm for simultaneously estimating the parameters and the unobserved variables with uncertainty quantification in the presence of only partial observations. Utilizing an appropriate CGNS as a preconditioner significantly reduces the computational cost of accurately estimating the parameters in the original complex system. Finally, the CGNS advances rapid and statistically accurate algorithms for computing the probability density function and sampling the trajectories of the unobserved state variables. These fast algorithms facilitate the development of an efficient and accurate data-driven method for predicting the linear response of the original system with respect to parameter perturbations based on a suitable CGNS preconditioner.
-
We present a non‐Gaussian ensemble data assimilation method based on the maximum‐likelihood ensemble filter, which allows for any combination of Gaussian, lognormal, and reverse lognormal errors in both the background and the observations. The technique is fully nonlinear, does not require a tangent linear model, and uses a Hessian preconditioner to minimise the cost function efficiently in ensemble space. When the Gaussian assumption is relaxed, the results show significant improvements in the analysis skill within two atmospheric toy models, and the performance of data assimilation systems for (semi)bounded variables is expected to improve.
