
Title: Ensemble Kalman filter updates based on regularized sparse inverse Cholesky factors
Abstract The ensemble Kalman filter (EnKF) is a popular technique for data assimilation in high-dimensional nonlinear state-space models. The EnKF represents distributions of interest by an ensemble, which is a form of dimension reduction that enables straightforward forecasting even for complicated and expensive evolution operators. However, the EnKF update step involves estimation of the forecast covariance matrix based on the (often small) ensemble, which requires regularization. Many existing regularization techniques rely on spatial localization, which may ignore long-range dependence. Instead, our proposed approach assumes a sparse Cholesky factor of the inverse covariance matrix, and the nonzero Cholesky entries are further regularized. The resulting method is highly flexible and computationally scalable. In our numerical experiments, our approach was more accurate and less sensitive to misspecification of tuning parameters than tapering-based localization.
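The modified-Cholesky idea underlying such a precision estimate can be sketched as follows. This is a minimal illustration, not the authors' implementation: each state variable is regressed on a small set of preceding variables (a simple fixed bandwidth here, with a hypothetical ridge penalty standing in for the paper's regularization of the nonzero Cholesky entries), and the fitted coefficients and residual variances assemble a sparse unit lower-triangular factor T with precision Q = T' D^{-1} T.

```python
import numpy as np

def sparse_inv_chol_precision(ensemble, bandwidth, ridge=1e-2):
    """Regularized modified-Cholesky estimate of the forecast precision matrix.

    `ensemble` has shape (p, N): p state variables, N ensemble members.
    Each variable is regressed on up to `bandwidth` preceding variables;
    `ridge` is an illustrative regularizer, not the paper's choice."""
    X = ensemble - ensemble.mean(axis=1, keepdims=True)  # anomalies, (p, N)
    p, N = X.shape
    T = np.eye(p)                      # unit lower-triangular factor
    d = np.empty(p)                    # innovation (residual) variances
    d[0] = X[0] @ X[0] / (N - 1)
    for i in range(1, p):
        lo = max(0, i - bandwidth)
        Z = X[lo:i]                    # predictors: nearest preceding variables
        y = X[i]
        A = Z @ Z.T / (N - 1) + ridge * np.eye(i - lo)
        b = Z @ y / (N - 1)
        beta = np.linalg.solve(A, b)   # ridge-regularized regression coefficients
        T[i, lo:i] = -beta
        resid = y - beta @ Z
        d[i] = resid @ resid / (N - 1)
    # Precision of the forecast distribution: Q = T' D^{-1} T
    return T.T @ np.diag(1.0 / d) @ T
```

By construction Q is symmetric positive definite for any ensemble, which is one practical advantage of factorizing the precision rather than tapering the covariance directly.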
Authors:
Award ID(s):
1934904 1953005 1654083
Publication Date:
NSF-PAR ID:
10277276
Journal Name:
Monthly Weather Review
ISSN:
0027-0644
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. Ever since its inception, the ensemble Kalman filter (EnKF) has elicited many heuristic approaches that sought to improve it. One such method is covariance localization, which alleviates spurious correlations due to finite ensemble sizes by using relevant spatial correlation information. Adaptive localization techniques account for how correlations change in time and space, in order to obtain improved covariance estimates. This work develops a Bayesian approach to adaptive Schur-product localization for the deterministic ensemble Kalman filter (DEnKF) and extends it to support multiple radii of influence. We test the proposed adaptive localization using the toy Lorenz'96 problem and a more realistic 1.5-layer quasi-geostrophic model. Results with the toy problem show that the multivariate approach informs us that strongly observed variables can tolerate larger localization radii. The univariate approach leads to markedly improved filter performance for the realistic geophysical model, with a reduction in error by as much as 33%.
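Plain (non-adaptive) Schur-product localization, which the adaptive scheme above builds on, can be sketched as follows. This is a generic illustration using the standard Gaspari-Cohn taper with a single fixed radius, not the paper's Bayesian adaptive version; the pairwise distance matrix and localization radius are assumed inputs.

```python
import numpy as np

def gaspari_cohn(z):
    """Gaspari-Cohn fifth-order taper: 1 at z = 0, compactly supported (0 for z >= 2)."""
    z = np.abs(np.asarray(z, dtype=float))
    t = np.zeros_like(z)
    near = z <= 1
    far = (z > 1) & (z < 2)
    x = z[near]
    t[near] = 1 - 5/3 * x**2 + 5/8 * x**3 + 1/2 * x**4 - 1/4 * x**5
    x = z[far]
    t[far] = 4 - 5*x + 5/3 * x**2 + 5/8 * x**3 - 1/2 * x**4 + 1/12 * x**5 - 2 / (3*x)
    return t

def localized_covariance(ensemble, dists, radius):
    """Schur (elementwise) product of the sample covariance with a taper matrix."""
    X = ensemble - ensemble.mean(axis=1, keepdims=True)  # (p, N) anomalies
    B = X @ X.T / (X.shape[1] - 1)                       # raw sample covariance
    return gaspari_cohn(dists / radius) * B              # elementwise tapering
```

The taper zeroes out every covariance entry between points farther apart than twice the radius, which is exactly the long-range dependence that the regularized-Cholesky approach in the main article aims to retain.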
  2. Abstract Covariance matrices are fundamental to the analysis and forecast of economic, physical and biological systems. Although the eigenvalues $\{\lambda _i\}$ and eigenvectors $\{\boldsymbol{u}_i\}$ of a covariance matrix are central to such endeavours, in practice one must inevitably approximate the covariance matrix based on data with finite sample size $n$ to obtain empirical eigenvalues $\{\tilde{\lambda }_i\}$ and eigenvectors $\{\tilde{\boldsymbol{u}}_i\}$, and therefore understanding the error so introduced is of central importance. We analyse eigenvector error $\|\boldsymbol{u}_i - \tilde{\boldsymbol{u}}_i \|^2$ while leveraging the assumption that the true covariance matrix having size $p$ is drawn from a matrix ensemble with known spectral properties—particularly, we assume the distribution of population eigenvalues weakly converges as $p\to \infty $ to a spectral density $\rho (\lambda )$ and that the spacing between population eigenvalues is similar to that for the Gaussian orthogonal ensemble. Our approach complements previous analyses of eigenvector error that require the full set of eigenvalues to be known, which can be computationally infeasible when $p$ is large. To provide a scalable approach for uncertainty quantification of eigenvector error, we consider a fixed eigenvalue $\lambda $ and approximate the distribution of the expected square error $r= \mathbb{E}\left [\| \boldsymbol{u}_i - \tilde{\boldsymbol{u}}_i \|^2\right ]$ across the matrix ensemble for all $\boldsymbol{u}_i$ associated with $\lambda _i=\lambda $. We find, for example, that for sufficiently large matrix size $p$ and sample size $n> p$, the probability density of $r$ scales as $1/nr^2$. This power-law scaling implies that the eigenvector error is extremely heterogeneous—even if $r$ is very small for most eigenvectors, it can be large for others with non-negligible probability. We support this and further results with numerical experiments.
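The eigenvector error quantity studied above is easy to estimate empirically. The sketch below is an illustrative Monte Carlo setup, not the authors' matrix-ensemble analysis: it draws $n$ samples from a known population covariance, sign-aligns the empirical leading eigenvector with the population one (eigenvectors are only defined up to sign), and reports the squared error $\|\boldsymbol{u} - \tilde{\boldsymbol{u}}\|^2$.

```python
import numpy as np

def leading_eigvec_sq_error(Sigma, n, rng):
    """Squared error ||u - u_tilde||^2 between the population and sample
    leading eigenvectors, with the sample vector sign-aligned to u."""
    w, V = np.linalg.eigh(Sigma)
    u = V[:, -1]                                    # population leading eigenvector
    X = rng.multivariate_normal(np.zeros(len(u)), Sigma, size=n)
    S = np.cov(X, rowvar=False)                     # empirical covariance
    wt, Vt = np.linalg.eigh(S)
    ut = Vt[:, -1]
    ut = ut * np.sign(u @ ut)                       # resolve the sign ambiguity
    return float(np.sum((u - ut) ** 2))

rng = np.random.default_rng(0)
Sigma = np.diag([10.0, 1.0, 1.0, 1.0])              # well-separated top eigenvalue
err = leading_eigvec_sq_error(Sigma, n=2000, rng=rng)
```

With a well-separated top eigenvalue and $n \gg p$, the error is small; repeating the draw across many covariance matrices from an ensemble would trace out the heterogeneous distribution of $r$ described in the abstract.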
  3. Ensemble Kalman filter (EnKF) analyses of the storms associated with the 8 May 2017 Colorado severe hail event using either the Milbrandt and Yau (MY) or the NSSL double-moment bulk microphysics scheme in the forecast model are evaluated. With each scheme, two experiments are conducted in which the reflectivity (Z) observations update, in addition to dynamic and thermodynamic variables, either 1) only the hydrometeor mixing ratios or 2) all microphysical variables. With fewer microphysical variables directly constrained by the Z observations, only updating hydrometeor mixing ratios causes the forecast error covariance structure to become unreliable, and results in larger errors in the analysis. Experiments that update all microphysical variables produce analyses with the lowest Z root-mean-square innovations; however, comparing the estimated hail size against hydrometeor classification algorithm output suggests that further constraint from observations is needed to more accurately estimate surface hail size. Ensemble correlation analyses are performed to determine the impact of hail growth assumptions in the MY and NSSL schemes on the forecast error covariance between microphysical and thermodynamic variables. In the MY scheme, Z is negatively correlated with updraft intensity because the strong updrafts produce abundant small hail aloft. The NSSL scheme predicts the growth of large hail aloft; consequently, Z is positively correlated with storm updraft intensity and hail state variables. Hail production processes are also shown to alter the background error covariance for liquid and frozen hydrometeor species. Results in this study suggest that EnKF analyses are sensitive to the choice of MP scheme (e.g., the treatment of hail growth processes).
  4. We present an ensemble filtering method based on a linear model for the precision matrix (the inverse of the covariance) with the parameters determined by Score Matching Estimation. The method provides a rigorous covariance regularization when the underlying random field is Gaussian Markov. The parameters are found by solving a system of linear equations. The analysis step uses the inverse formulation of the Kalman update. Several filter versions, differing in the construction of the analysis ensemble, are proposed, as well as a Score matching version of the Extended Kalman Filter.
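The inverse (precision) formulation of the Kalman update mentioned above can be sketched as follows. This is the generic information-filter algebra with hypothetical variable names, not the paper's score-matching construction: the analysis precision is $Q_a = Q_f + H^\top R^{-1} H$, and the analysis mean solves $Q_a x_a = Q_f x_f + H^\top R^{-1} y$.

```python
import numpy as np

def precision_form_update(Q_f, x_f, H, R, y):
    """Kalman analysis step in precision (information) form.

    Q_f is the forecast precision (inverse covariance); the analysis
    precision Q_a is returned likewise, and the mean update solves a
    linear system instead of forming an explicit Kalman gain."""
    HtRinv = H.T @ np.linalg.inv(R)
    Q_a = Q_f + HtRinv @ H                  # analysis precision
    rhs = Q_f @ x_f + HtRinv @ y            # information-vector update
    x_a = np.linalg.solve(Q_a, rhs)         # analysis mean
    return x_a, Q_a
```

This form is attractive when the precision is sparse, as in the Gaussian Markov setting the abstract assumes: the observation term only adds entries where H touches the state, so sparsity is largely preserved.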

  5. To improve Thermosphere–Ionosphere modeling during disturbed conditions, data assimilation schemes that can account for the large and fast-moving gradients moving through the modeled domain are necessary. We argue that this requires a physics-based background model with a non-stationary covariance. An added benefit of using physics-based models would be improved forecasting capability over the largely persistence-based forecasts of empirical models. As a reference implementation, we have developed an ensemble Kalman filter (EnKF) software called Thermosphere Ionosphere Data Assimilation (TIDA) using the physics-based Coupled Thermosphere Ionosphere Plasmasphere electrodynamics (CTIPe) model as the background. In this paper, we present detailed results from experiments during the 2003 Halloween Storm, 27–31 October 2003, under very disturbed (Kp = 9) conditions while assimilating GRACE-A and -B, and CHAMP neutral density measurements. TIDA simulates this disturbed period without using the L1 solar wind measurements, which were contaminated by solar energetic protons, by instead estimating the model drivers from the density measurements. We also briefly present statistical results for two additional storms, September 27 – October 2, 2002, and July 26 – 30, 2004, to show that the improvement in assimilated neutral density specification is not an artifact of the corrupted forcing observations during the 2003 Halloween Storm. By showing statistical results from assimilating one satellite at a time, we show that TIDA produces a coherent global specification for neutral density throughout the storm, a critical capability in calculating satellite drag and debris collision avoidance for space traffic management.