
Title: Ensemble Riemannian data assimilation: towards large-scale dynamical systems
Abstract. This paper presents results of ensemble Riemannian data assimilation for relatively high-dimensional nonlinear dynamical systems, focusing on the chaotic Lorenz-96 model and a two-layer quasi-geostrophic (QG) model of atmospheric circulation. The analysis state in this approach is inferred from a joint distribution that optimally couples the background probability distribution and the likelihood function, enabling formal treatment of systematic biases without any Gaussian assumptions. Despite the risk of the curse of dimensionality in the computation of the coupling distribution, comparisons with the classic implementation of the particle filter and the stochastic ensemble Kalman filter demonstrate that, with the same ensemble size, the presented methodology could improve the predictability of dynamical systems. In particular, under systematic errors, the root mean squared error of the analysis state can be reduced by 20 % (30 %) in the Lorenz-96 (QG) model.
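As a rough illustration of the coupling idea summarized in the abstract, that is, inferring the analysis from a joint distribution coupling the background distribution with the likelihood, the sketch below pairs a small background ensemble with a likelihood-weighted point cloud via entropy-regularized optimal transport (Sinkhorn iterations) and displaces the ensemble along the resulting plan. It is a generic sketch, not the paper's implementation; the ensemble size, observation error, regularization strength eps, and displacement parameter eta are illustrative assumptions.

```python
# Hedged sketch: entropic optimal-transport coupling of a background ensemble
# with a likelihood-weighted point cloud, in the spirit of the coupling step
# described in the abstract.  All names and parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Background ensemble (M members of an n-dimensional state), uniform weights.
M, n = 50, 3
x_bkg = rng.normal(loc=1.0, scale=1.0, size=(M, n))
p = np.full(M, 1.0 / M)

# Point cloud representing the observation/likelihood side (here: perturbed
# observations of the full state), weighted by a Gaussian likelihood.
y_obs = np.zeros(n)                      # hypothetical observation
r_std = 0.5                              # assumed observation error std
y_cloud = y_obs + r_std * rng.normal(size=(M, n))
q = np.exp(-0.5 * np.sum((y_cloud - y_obs) ** 2, axis=1) / r_std**2)
q /= q.sum()

# Entropic OT (Sinkhorn iterations) between the two weighted point clouds.
C = np.sum((x_bkg[:, None, :] - y_cloud[None, :, :]) ** 2, axis=-1)
eps = 0.1 * C.mean()                     # regularization strength (tunable)
K = np.exp(-C / eps)
u, v = np.ones(M), np.ones(M)
for _ in range(500):
    u = p / (K @ v)
    v = q / (K.T @ u)
T = u[:, None] * K * v[None, :]          # coupling (transport plan)

# One simple way to form an analysis ensemble from the coupling: move each
# background member part of the way toward its barycentric transport target.
eta = 0.5                                # interpolation parameter (tunable)
targets = (T @ y_cloud) / T.sum(axis=1, keepdims=True)
x_ana = (1.0 - eta) * x_bkg + eta * targets
print("background mean:", x_bkg.mean(axis=0))
print("analysis mean:  ", x_ana.mean(axis=0))
```

In the actual method the coupling is treated on Wasserstein (Riemannian) geometry and is used to handle systematic bias formally; the interpolation step above only indicates where such a coupling enters an ensemble update.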
Authors:
Award ID(s):
1830418 1839441 1839336
Publication Date:
NSF-PAR ID:
10384610
Journal Name:
Nonlinear Processes in Geophysics
Volume:
29
Issue:
1
Page Range or eLocation-ID:
77 to 92
ISSN:
1607-7946
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Data assimilation (DA) aims to combine observations/data with a model to maximize the utility of information for obtaining the optimal estimate. The maximum likelihood ensemble filter (MLEF) is a sequential, filter-type DA method. Weaknesses of filter-type methods are the assimilation of time-integrated observations and the estimation of empirical model parameters, because the forward model is employed outside of the analysis procedure. To overcome these weaknesses, the MLEF is extended as a smoother and the novel maximum likelihood ensemble smoother (MLES) is proposed. The MLES is a smoothing method with variational-like qualities, specifically in the cost function. Rather than using the error information from a single temporal location to solve for the optimal analysis update, as done by the MLEF, the MLES can include observations and the forward model within a chosen time window. The newly proposed DA method is first validated by a series of rigorous performance tests using the Lorenz 96 model. Then, as DA is used extensively to increase the predictability of the chaotic dynamical systems common in meteorological applications, the study demonstrates the MLES on a chaotic model problem governed by the 1D Kuramoto–Sivashinsky (KS) equation. Additionally, the MLES is shown to be effective in improving the estimate of uncertain empirical model parameters. The MLES and MLEF are then directly compared, and it is shown that the performance of the MLES is adequate and that it is a good candidate for increasing the predictability of a chaotic dynamical system. Future work will focus on an extensive application of the MLES to highly turbulent flows.
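To make the filter-versus-smoother distinction concrete, the sketch below minimizes a generic window-based cost: a background misfit at the start of the window plus observation misfits accumulated by running the forward model (a small Lorenz-96 configuration) across the window. This illustrates the kind of objective a smoother works with, not the MLES algorithm itself; the window length, error variances, and the gradient-free scipy optimizer are assumptions made for brevity.

```python
# Hedged sketch: a generic smoother-type cost over a time window, fitting the
# window of observations through the forward model.  Not the MLES itself,
# which is an ensemble-based method; parameter values are illustrative.
import numpy as np
from scipy.optimize import minimize

def lorenz96_step(x, dt=0.05, F=8.0):
    """One RK4 step of the Lorenz-96 model."""
    def f(x):
        return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

rng = np.random.default_rng(1)
n, window, r_std, b_std = 12, 5, 0.5, 1.0
x_true = rng.normal(size=n) + 8.0                  # illustrative initial truth
x_bkg = x_true + b_std * rng.normal(size=n)        # background at window start

# Synthetic observations of the full state at each step of the window.
obs, xt = [], x_true.copy()
for _ in range(window):
    xt = lorenz96_step(xt)
    obs.append(xt + r_std * rng.normal(size=n))

def cost(x0):
    """Background misfit plus observation misfits accumulated over the window."""
    J = 0.5 * np.sum((x0 - x_bkg) ** 2) / b_std**2
    x = x0.copy()
    for y in obs:
        x = lorenz96_step(x)
        J += 0.5 * np.sum((x - y) ** 2) / r_std**2
    return J

x_ana = minimize(cost, x_bkg, method="L-BFGS-B").x
print("initial-condition error before:", np.linalg.norm(x_bkg - x_true))
print("initial-condition error after: ", np.linalg.norm(x_ana - x_true))
```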
  2. Abstract. Rejuvenation in particle filters is necessary to prevent the collapse of the weights when the number of particles is insufficient to properly sample the high-probability regions of the state space. Rejuvenation is often implemented in a heuristic manner by the addition of random noise that widens the support of the ensemble. This work aims at improving canonical rejuvenation methodology by the introduction of additional prior information obtained from climatological samples; the dynamical particles used for importance sampling are augmented with samples obtained from stochastic covariance shrinkage. A localized variant of the proposed method is developed. Numerical experiments with the Lorenz '63 model show that modified filters significantly improve the analyses for low dynamical ensemble sizes. Furthermore, localization experiments with the Lorenz '96 model show that the proposed methodology is extendable to larger systems.
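The rejuvenation idea described above can be illustrated with a generic sketch: after resampling, duplicated particles are perturbed with noise drawn from a shrinkage covariance that blends the low-rank ensemble covariance with a climatological covariance. The shrinkage factor, rejuvenation amplitude, and synthetic climatology below are illustrative assumptions, not the paper's stochastic-shrinkage scheme.

```python
# Hedged sketch: rejuvenating a resampled particle ensemble with noise drawn
# from a shrinkage covariance blending the (possibly rank-deficient) dynamic
# ensemble covariance with a climatological covariance.  Illustrative only.
import numpy as np

rng = np.random.default_rng(2)
n, n_particles, n_clim = 10, 20, 500

# Dynamic particles (few) and a large climatological sample (many).
particles = rng.normal(size=(n_particles, n))
climatology = rng.normal(scale=2.0, size=(n_clim, n))

# Importance weights from a hypothetical likelihood, then multinomial resampling.
weights = rng.random(n_particles)
weights /= weights.sum()
idx = rng.choice(n_particles, size=n_particles, p=weights)
resampled = particles[idx]

# Shrinkage covariance: convex combination of ensemble and climatological covariances.
gamma = 0.3                                   # shrinkage factor (tunable)
P_ens = np.cov(resampled, rowvar=False)
P_clim = np.cov(climatology, rowvar=False)
P_shrink = (1.0 - gamma) * P_ens + gamma * P_clim

# Rejuvenation: perturb the duplicated particles with shrinkage-covariance noise.
tau = 0.1                                     # rejuvenation amplitude (tunable)
L = np.linalg.cholesky(P_shrink + 1e-8 * np.eye(n))
rejuvenated = resampled + tau * rng.normal(size=(n_particles, n)) @ L.T
print("unique particles after resampling:", len(np.unique(idx)))
```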
  3. Abstract

    Long‐lead forecasting for spatio‐temporal systems can entail complex nonlinear dynamics that are difficult to specify a priori. Current statistical methodologies for modeling these processes are often highly parameterized and, thus, challenging to implement from a computational perspective. One potential parsimonious solution to this problem is a method from the dynamical systems and engineering literature referred to as an echo state network (ESN). ESN models use reservoir computing to efficiently compute recurrent neural network forecasts. Moreover, multilevel (deep) hierarchical models have recently been shown to be successful at predicting high‐dimensional complex nonlinear processes, particularly those with multiple spatial and temporal scales of variability (such as those we often find in spatio‐temporal environmental data). Here, we introduce a deep ensemble ESN (D‐EESN) model. Despite the incorporation of a deep structure, the presented model is computationally efficient. We present two versions of this model for spatio‐temporal processes that produce forecasts and associated measures of uncertainty. The first approach utilizes a bootstrap ensemble framework, and the second is developed within a hierarchical Bayesian framework (BD‐EESN). This more general hierarchical Bayesian framework naturally accommodates non‐Gaussian data types and multiple levels of uncertainties. The methodology is first applied to a data set simulated from a novel non‐Gaussian multiscale Lorenz‐96 dynamical system simulation model and, then, to a long‐lead United States (U.S.) soil moisture forecasting application. Across both applications, the proposed methodology improves upon existing methods in terms of both forecast accuracy and quantifying uncertainty. (A minimal reservoir-computing sketch follows this abstract.)

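The sketch below shows the bare reservoir-computing recursion referenced in the ESN abstracts here: a fixed sparse recurrence matrix rescaled to a target spectral radius, a tanh state update driven by the input series, and a ridge-regression readout. It is a minimal single-layer ESN on a toy scalar series, not the deep ensemble D-EESN; the reservoir size, spectral radius, sparsity, washout, and ridge penalty are illustrative choices.

```python
# Hedged sketch: a minimal echo state network (reservoir computer).
# Single reservoir, toy scalar series; all hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(3)

# Toy scalar time series for one-step-ahead prediction.
t = np.arange(0, 60, 0.05)
u = np.sin(t) + 0.5 * np.sin(2.3 * t)

# Reservoir: sparse random recurrence matrix rescaled to a target spectral radius.
n_res, rho, sparsity = 300, 0.9, 0.05
A = rng.normal(size=(n_res, n_res)) * (rng.random((n_res, n_res)) < sparsity)
A *= rho / np.max(np.abs(np.linalg.eigvals(A)))
W_in = rng.uniform(-0.5, 0.5, size=n_res)

# Drive the reservoir with the input signal: states[k] encodes u[0..k-1].
states = np.zeros((len(u), n_res))
r = np.zeros(n_res)
for k in range(len(u) - 1):
    r = np.tanh(A @ r + W_in * u[k])
    states[k + 1] = r

# Ridge-regression readout: map each reservoir state to the input value one
# step ahead of the last input it has seen.
washout, beta = 100, 1e-6
X, y = states[washout:], u[washout:]
W_out = np.linalg.solve(X.T @ X + beta * np.eye(n_res), X.T @ y)

pred = X @ W_out
print("one-step in-sample RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```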
  4. Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace expensive numerical integration of complex coupled partial differential equations at fine time scales performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system, and the three tier multi-scale Lorenz 96 system (Thornes T, Duben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143, 897–908. (doi:10.1002/qj.2974)) as benchmarks, we realize that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) for the sparse, randomly generated, reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz JN. 2016 Proc. Natl Acad. Sci. USA 113, 3932–3937. (doi:10.1073/pnas.1517384113)). We also show experimentally that LSR-ESNs (Chattopadhyay A, Hassanzadeh P, Subramanian D. 2019 (http://arxiv.org/abs/1906.08829)) outperform HSR ESNs (Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018 Phys. Rev. Lett. 120, 024102. (doi:10.1103/PhysRevLett.120.024102)) while D2R2 dominates both approaches. A significant goal in constructing surrogates is to cope with barriers to scaling in weather prediction and simulation of dynamical systems that are imposed by time and energy consumption in supercomputers. Inexact computing has emerged as a novel approach to helping with scaling. In this paper, we evaluate the performance of three models (LSR-ESN, HSR-ESN and D2R2) by varying the precision or word size of the computation as our inexactness-controlling parameter. For precisions of 64, 32 and 16 bits, we show that, surprisingly, the least expensive D2R2 method yields the most robust results and the greatest savings compared to ESNs. Specifically, D2R2 achieves 68× in computational savings, with an additional 2× if precision reductions are also employed, outperforming ESN variants by a large margin. This article is part of the theme issue ‘Machine learning for weather and climate modelling’.
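For contrast with the reservoir approach, the following sketch fits a ridge-regularized quadratic-regression surrogate that maps the current state to the next state, in the spirit of the polynomial-regression reduction (D2R2) discussed above. It is a generic illustration, not the authors' D2R2 or SINDy formulation; the Lorenz-63 trajectory, Euler time step, feature set, and ridge penalty are assumptions.

```python
# Hedged sketch: a ridge-regularized quadratic-regression surrogate for a
# dynamical system.  Illustrative only; not the D2R2 or SINDy algorithms.
import numpy as np

rng = np.random.default_rng(4)

def lorenz63_step(x, dt=0.01, s=10.0, r=28.0, b=8.0 / 3.0):
    """One Euler step of the Lorenz-63 system (crude but fine for a sketch)."""
    dx = np.array([s * (x[1] - x[0]),
                   x[0] * (r - x[2]) - x[1],
                   x[0] * x[1] - b * x[2]])
    return x + dt * dx

# Generate a training trajectory.
n_steps = 5000
traj = np.empty((n_steps, 3))
traj[0] = [1.0, 1.0, 1.0]
for k in range(n_steps - 1):
    traj[k + 1] = lorenz63_step(traj[k])

def features(x):
    """Constant, linear, and quadratic monomials of the state."""
    quad = np.outer(x, x)[np.triu_indices(3)]
    return np.concatenate(([1.0], x, quad))

# Ridge regression from features of the current state to the next state.
X = np.array([features(x) for x in traj[:-1]])
Y = traj[1:]
beta = 1e-8
W = np.linalg.solve(X.T @ X + beta * np.eye(X.shape[1]), X.T @ Y)

# Roll the learned map forward from the end of the training trajectory.
x = traj[-1].copy()
for _ in range(5):
    x = features(x) @ W
print("surrogate state after 5 steps:", x)
```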
  5. Hoteit, Ibrahim (Ed.)
    A hybrid particle ensemble Kalman filter is developed for problems with medium non-Gaussianity, i.e., problems where the prior is very non-Gaussian but the posterior is approximately Gaussian. Such situations arise, e.g., when nonlinear dynamics produce a non-Gaussian forecast but a tight Gaussian likelihood leads to a nearly Gaussian posterior. The hybrid filter starts by factoring the likelihood. First, the particle filter assimilates the observations with one factor of the likelihood to produce an intermediate prior that is close to Gaussian, and then the ensemble Kalman filter completes the assimilation with the remaining factor. How the likelihood gets split between the two stages is determined in such a way as to ensure that the particle filter avoids collapse, and particle degeneracy is broken by a mean-preserving random orthogonal transformation. The hybrid is tested in a simple two-dimensional (2D) problem and a multiscale system of ODEs motivated by the Lorenz-‘96 model. In the 2D problem it outperforms both a pure particle filter and a pure ensemble Kalman filter, and in the multiscale Lorenz-‘96 model it is shown to outperform a pure ensemble Kalman filter, provided that the ensemble size is large enough.
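A generic version of the two-stage update described above can be sketched by splitting a Gaussian likelihood into two tempered factors: a particle-filter step that weights and resamples with exponent alpha, followed by a stochastic EnKF step whose effective observation-error covariance is R/(1 - alpha). How the split is chosen adaptively and the mean-preserving random orthogonal transformation used to break degeneracy are not reproduced here; the state dimension, ensemble size, bimodal prior, and value of alpha are illustrative.

```python
# Hedged sketch: a two-stage tempered particle-filter + stochastic-EnKF update,
# illustrating the likelihood-splitting idea.  Not the paper's hybrid filter.
import numpy as np

rng = np.random.default_rng(5)
n, n_ens = 4, 40
H = np.eye(n)                    # observe the full state (illustrative)
r_std = 0.3                      # observation error std
alpha = 0.3                      # fraction of the likelihood given to the PF

# Non-Gaussian prior ensemble (a bimodal mixture) and a synthetic observation.
modes = rng.choice([-2.0, 2.0], size=(n_ens, 1))
ens = modes + 0.5 * rng.normal(size=(n_ens, n))
x_true = np.full(n, 2.0)
y = H @ x_true + r_std * rng.normal(size=n)

# Stage 1: particle-filter step with the tempered likelihood p(y|x)**alpha.
innov = y - ens @ H.T
logw = -0.5 * alpha * np.sum(innov**2, axis=1) / r_std**2
w = np.exp(logw - logw.max())
w /= w.sum()
ens = ens[rng.choice(n_ens, size=n_ens, p=w)]

# Stage 2: stochastic EnKF step with the remaining factor, i.e. error
# covariance R / (1 - alpha), applied with perturbed observations.
R_eff = (r_std**2 / (1.0 - alpha)) * np.eye(n)
P = np.cov(ens, rowvar=False)
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R_eff)
y_pert = y + rng.multivariate_normal(np.zeros(n), R_eff, size=n_ens)
ens = ens + (y_pert - ens @ H.T) @ K.T
print("analysis mean:", ens.mean(axis=0), " truth:", x_true)
```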