Title: Analog ensemble data assimilation in a quasigeostrophic coupled model
Abstract: The ensemble forecast dominates the computational cost of many data assimilation methods, especially for high‐resolution and coupled models. In situations where the cost is prohibitive, one can use a lower‐cost model, a lower‐cost data assimilation method, or both. Ensemble optimal interpolation (EnOI) is a classical example of a lower‐cost ensemble data assimilation method that replaces the ensemble forecast with a single forecast and then constructs an ensemble about this single forecast by adding perturbations drawn from climatology. This research develops lower‐cost ensemble data assimilation methods that add perturbations to a single forecast, where the perturbations are obtained from analogs of the single model forecast. These analogs can be found in a catalog of model states, constructed using linear combinations of model states from a catalog, or constructed using generative machine‐learning methods. Four analog ensemble data assimilation methods, including two new ones, are compared with EnOI in the context of a coupled model of intermediate complexity: Q‐GCM. Depending on the method and on the physical variable, analog methods can be up to 40% more accurate than EnOI.
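The idea of building an ensemble around a single forecast can be illustrated with a small sketch. It is not the authors' implementation: the function names, the Euclidean distance used to pick analogs, the toy two-variable catalog, and the restriction to one scalar observation are all assumptions made for illustration. Only the simplest variant named in the abstract (analogs taken directly from a catalog) is shown, and only the update of the single forecast itself.

```python
import math

def nearest_analogs(forecast, catalog, k):
    """Return the k catalog states closest to the forecast (Euclidean distance, an assumption)."""
    return sorted(catalog, key=lambda state: math.dist(state, forecast))[:k]

def perturbations(states):
    """Subtract the mean of the given states from each state to obtain perturbations."""
    n = len(states)
    mean = [sum(column) / n for column in zip(*states)]
    return [[x - m for x, m in zip(state, mean)] for state in states]

def update_mean(forecast, perts, obs_index, obs_value, obs_var):
    """Kalman-type update of a single forecast from one scalar observation.

    The observation is assumed to measure state component obs_index directly,
    with error variance obs_var. Covariances come from the perturbations.
    """
    n = len(perts)
    # Sample covariance of every state component with the observed component.
    cov = [sum(p[i] * p[obs_index] for p in perts) / (n - 1)
           for i in range(len(forecast))]
    gain = [c / (cov[obs_index] + obs_var) for c in cov]
    innovation = obs_value - forecast[obs_index]
    return [x + g * innovation for x, g in zip(forecast, gain)]

# Toy catalog of two-variable model states (hypothetical values).
catalog = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, -10.0]]
forecast = [1.0, 1.0]

# Analog variant: perturbations come from the 3 states nearest the forecast.
analog_perts = perturbations(nearest_analogs(forecast, catalog, 3))
print(update_mean(forecast, analog_perts, 0, 2.0, 1.0))  # prints [1.5, 1.5]

# Climatological (EnOI-like) variant: perturbations come from the whole catalog.
clim_perts = perturbations(catalog)
print(update_mean(forecast, clim_perts, 0, 2.0, 1.0))
```

In this toy case the outlying catalog state `[10.0, -10.0]` is excluded from the analog set, so the analog perturbations reflect only states that resemble the forecast, whereas the climatological perturbations are dominated by the outlier.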
Award ID(s):
2152814
PAR ID:
10419732
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Quarterly Journal of the Royal Meteorological Society
Volume:
149
Issue:
752
ISSN:
0035-9009
Page Range / eLocation ID:
p. 1018-1037
Sponsoring Org:
National Science Foundation
More Like this
  1. Barambones, Oscar (Ed.)
    Accurate quantification of uncertainty in solar photovoltaic (PV) generation forecasts is imperative for the efficient and reliable operation of the power grid. In this paper, a data-driven non-parametric probabilistic method based on the Naïve Bayes (NB) classification algorithm and Dempster–Shafer theory (DST) of evidence is proposed for day-ahead probabilistic PV power forecasting. This NB-DST method extends traditional deterministic solar PV forecasting methods by quantifying the uncertainty of their forecasts by estimating the cumulative distribution functions (CDFs) of their forecast errors and forecast variables. The statistical performance of this method is compared with the analog ensemble method and the persistence ensemble method under three different weather conditions using real-world data. The study results reveal that the proposed NB-DST method coupled with an artificial neural network model outperforms the other methods in that its estimated CDFs have lower spread, higher reliability, and sharper probabilistic forecasts with better accuracy. 
  2. Abstract State estimation in multi-layer turbulent flow fields with only a single layer of partial observation remains a challenging yet practically important task. Applications include inferring the state of the deep ocean by exploiting surface observations. Directly implementing an ensemble Kalman filter based on the full forecast model is usually expensive. One widely used method in practice projects the information of the observed layer to other layers via linear regression. However, large errors appear when nonlinearity in the highly turbulent flow field becomes dominant. In this paper, we develop a multi-step nonlinear data assimilation method that involves the sequential application of nonlinear assimilation steps across layers. Unlike traditional linear regression approaches, a conditional Gaussian nonlinear system is adopted as the approximate forecast model to characterize the nonlinear dependence between adjacent layers. At each step, samples drawn from the posterior of the current layer are treated as pseudo-observations for the next layer. Each sample is assimilated using analytic formulae for the posterior mean and covariance. The resulting Gaussian posteriors are then aggregated into a Gaussian mixture. Therefore, the method can capture strongly turbulent features, particularly intermittency and extreme events, and more accurately quantify the inherent uncertainty. Applications to the two-layer quasi-geostrophic system with Lagrangian data assimilation demonstrate that the multi-step method outperforms the one-step method, particularly as the tracer number and ensemble size increase. Results also show that the multi-step CGDA is particularly effective for assimilating frequent, high-accuracy observations, which are scenarios where traditional EnKF methods may suffer from catastrophic filter divergence. 
  3. Developing suitable approximate models for analyzing and simulating complex nonlinear systems is practically important. This paper aims at exploring the skill of a rich class of nonlinear stochastic models, known as the conditional Gaussian nonlinear system (CGNS), as both a cheap surrogate model and a fast preconditioner for facilitating many computationally challenging tasks. The CGNS preserves the underlying physics to a large extent and can reproduce intermittency, extreme events, and other non-Gaussian features in many complex systems arising from practical applications. Three interrelated topics are studied. First, the closed analytic formulas of solving the conditional statistics provide an efficient and accurate data assimilation scheme. It is shown that the data assimilation skill of a suitable CGNS approximate forecast model outweighs that by applying an ensemble method even to the perfect model with strong nonlinearity, where the latter suffers from filter divergence. Second, the CGNS allows the development of a fast algorithm for simultaneously estimating the parameters and the unobserved variables with uncertainty quantification in the presence of only partial observations. Utilizing an appropriate CGNS as a preconditioner significantly reduces the computational cost in accurately estimating the parameters in the original complex system. Finally, the CGNS advances rapid and statistically accurate algorithms for computing the probability density function and sampling the trajectories of the unobserved state variables. These fast algorithms facilitate the development of an efficient and accurate data-driven method for predicting the linear response of the original system with respect to parameter perturbations based on a suitable CGNS preconditioner. 
  4. Abstract. Localization is widely used in data assimilation schemes to mitigate the impact of sampling errors on ensemble-derived background error covariance matrices. Strongly coupled data assimilation allows observations in one component of a coupled model to directly impact another component through the inclusion of cross-domain terms in the background error covariance matrix. When different components have disparate dominant spatial scales, localization between model domains must properly account for the multiple length scales at play. In this work, we develop two new multivariate localization functions, one of which is a multivariate extension of the fifth-order piecewise rational Gaspari–Cohn localization function; the within-component localization functions are standard Gaspari–Cohn with different localization radii, while the cross-localization function is newly constructed. The functions produce positive semidefinite localization matrices which are suitable for use in both Kalman filters and variational data assimilation schemes. We compare the performance of our two new multivariate localization functions to two other multivariate localization functions and to the univariate and weakly coupled analogs of all four functions in a simple experiment with the bivariate Lorenz 96 system. In our experiments, the multivariate Gaspari–Cohn function leads to better performance than any of the other multivariate localization functions. 
  5. Abstract This paper identifies and explains particular differences and properties of adjoint-free iterative ensemble methods initially developed for parameter estimation in petroleum models. The aim is to demonstrate the methods’ potential for sequential data assimilation in coupled and multiscale unstable dynamical systems. For this study, we have introduced a new nonlinear and coupled multiscale model based on two Kuramoto–Sivashinsky equations operating on different scales where a coupling term relaxes the two model variables toward each other. This model provides a convenient testbed for studying data assimilation in highly nonlinear and coupled multiscale systems. We show that the model coupling leads to cross covariance between the two models’ variables, allowing for a combined update of both models. The measurements of one model’s variable will also influence the other and contribute to a more consistent estimate. Second, the new model allows us to examine the properties of iterative ensemble smoothers and assimilation updates over finite-length assimilation windows. We discuss the impact of varying the assimilation windows’ length relative to the model’s predictability time scale. Furthermore, we show that iterative ensemble smoothers significantly improve the solution’s accuracy compared to the standard ensemble Kalman filter update. Results and discussion provide an enhanced understanding of the ensemble methods’ potential implementation and use in operational weather- and climate-prediction systems. 
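For readers unfamiliar with the localization function named in item 4, the sketch below evaluates the standard univariate fifth-order piecewise rational Gaspari–Cohn function and applies it to a small covariance matrix by element-wise multiplication. The function names, the one-dimensional positions, and the parameterization by a half-width `c` (so the function vanishes beyond a distance of `2c`) are assumptions for illustration. The newly constructed cross-localization function described in that abstract is not reproduced here.

```python
def gaspari_cohn(distance, c):
    """Standard univariate Gaspari-Cohn function with half-width c (zero beyond 2c)."""
    r = abs(distance) / c
    if r <= 1.0:
        return (-0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3
                - (5.0 / 3.0) * r**2 + 1.0)
    if r < 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0

def localize(cov, positions, c):
    """Multiply each covariance entry by the localization weight for its pair of positions."""
    n = len(positions)
    return [[cov[i][j] * gaspari_cohn(positions[i] - positions[j], c)
             for j in range(n)]
            for i in range(n)]

# Hypothetical 3-point example on a line: nearby pairs keep part of their
# covariance, pairs farther apart than 2c are set to zero.
cov = [[1.0, 0.5, 0.5],
       [0.5, 1.0, 0.5],
       [0.5, 0.5, 1.0]]
positions = [0.0, 1.0, 3.0]
for row in localize(cov, positions, c=1.0):
    print(row)
```

The weight equals 1 at zero separation and decays smoothly to 0 at `2c`, which is how localization suppresses spurious long-range sample covariances while leaving variances unchanged.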