Abstract Many data assimilation methods require knowledge of the first two moments of the background and observation errors to function optimally. To ensure the effective performance of such methods, it is often advantageous to estimate the second moment of the observation errors directly. We examine three different strategies for doing so, focusing specifically on the case of a single scalar observation error variance parameter r. The first method is the well-known Desroziers et al. "diagnostic check" iteration (DBCP). The second method, described in Karspeck, adapts the "spread–error" diagnostic (used for assessing ensemble reliability) to observations and generates a point estimate of r by taking the expectation of various observation-space statistics and using an ensemble to model background error statistics explicitly. The third method is an approximate Bayesian scheme that uses an inverse-gamma prior and a modified Gaussian likelihood. All three methods can recover the correct observation error variance when both the background and observation errors are Gaussian and the background error variance is well specified. We also demonstrate that it is often possible to estimate r even when the observation error is not Gaussian or when the forward operator mapping model states into observation space is nonlinear. The DBCP method is found to be most robust to these complications; however, the other two methods perform similarly well in most cases and have the added benefit that they can be used to estimate r before data assimilation. We conclude that further investigation is warranted into the latter two methods, specifically into how they perform when extended to the multivariate case.

Significance Statement
Observations of the Earth system (e.g., from satellites, radiosondes, and aircraft) each have some associated uncertainty. To use observations to improve model forecasts, it is important to understand the size of that uncertainty. This study compares three statistical methods for estimating observation errors, all of which can be implemented continuously whenever new observations are used to correct a model. Our results suggest that all three methods can improve forecast outcomes, but that, if observations are believed to have highly biased or skewed errors, care should be taken in choosing which method to use and in interpreting its results. Future studies should investigate robust methods for estimating more complicated types of errors.
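To make the first method concrete, below is a minimal numerical sketch of a Desroziers-type iteration for a single scalar variance r. It assumes an identity forward operator, Gaussian errors, and a known background error variance b; the variable names and numerical values are illustrative, not taken from the paper. The iteration repeatedly assimilates with the current estimate of r and then updates r from the product of analysis residuals and background innovations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-variable setup (hypothetical values for illustration).
n_obs = 100_000          # observations per iteration
b = 1.0                  # background error variance (assumed known)
r_true = 0.5             # true observation error variance to be recovered

r_est = 2.0              # deliberately poor first guess for r
for it in range(10):
    x_true = rng.standard_normal(n_obs)
    x_b = x_true + np.sqrt(b) * rng.standard_normal(n_obs)       # background
    y = x_true + np.sqrt(r_true) * rng.standard_normal(n_obs)    # observations

    k = b / (b + r_est)                # scalar Kalman gain with current r
    x_a = x_b + k * (y - x_b)          # analysis

    # Desroziers relation: E[(y - x_a)(y - x_b)] = r when the gain is consistent.
    r_est = np.mean((y - x_a) * (y - x_b))
    print(f"iteration {it}: r_est = {r_est:.4f}")
```

At the fixed point the Desroziers relation E[(y - x_a)(y - x_b)] = r holds, so the iteration converges to the true variance when the background error variance is well specified.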
Sensitivity of a data-assimilation system for reconstructing three-dimensional cardiac electrical dynamics
Modelling of cardiac electrical behaviour has led to important mechanistic insights, but significant challenges, including uncertainty in model formulations and parameter values, make it difficult to obtain quantitatively accurate results. An alternative approach is combining models with observations from experiments to produce a data-informed reconstruction of system states over time. Here, we extend our earlier data-assimilation studies using an ensemble Kalman filter to reconstruct a three-dimensional time series of states with complex spatio-temporal dynamics using only surface observations of voltage. We consider the effects of several algorithmic and model parameters on the accuracy of reconstructions of known scroll-wave truth states using synthetic observations. In particular, we study the algorithm's sensitivity to parameters governing different parts of the process and its robustness to several model-error conditions. We find that the algorithm can achieve an acceptable level of error in many cases, with the weakest performance occurring for model-error cases and more extreme parameter regimes with more complex dynamics. Analysis of the poorest-performing cases indicates an initial decrease in error followed by an increase as the ensemble spread is reduced. Our results suggest avenues for further improvement, such as increasing ensemble spread by incorporating additive inflation or using a parameter or multi-model ensemble. This article is part of the theme issue 'Uncertainty quantification in cardiac and cardiovascular modelling and simulation'.
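As a rough illustration of the kind of update step involved (not the authors' implementation), the following sketch performs one stochastic ensemble-Kalman-filter analysis in which only a subset of state variables, standing in for surface voltage measurements, is observed. The array shapes, the pointwise observation operator, and the scalar observation error variance are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def enkf_analysis(X, y, obs_idx, r):
    """One stochastic EnKF analysis step (sketch).
    X: (n_state, n_ens) forecast ensemble; y: values observed at obs_idx;
    r: observation error variance, assumed scalar and uncorrelated."""
    n_state, n_ens = X.shape
    m = len(obs_idx)
    H = np.zeros((m, n_state))
    H[np.arange(m), obs_idx] = 1.0                  # pointwise (surface) observations
    A = X - X.mean(axis=1, keepdims=True)           # ensemble anomalies
    HA = H @ A
    P_yy = HA @ HA.T / (n_ens - 1) + r * np.eye(m)  # innovation covariance
    K = (A @ HA.T / (n_ens - 1)) @ np.linalg.inv(P_yy)  # Kalman gain
    # Perturbed observations keep the analysis spread statistically consistent.
    Y = y[:, None] + np.sqrt(r) * rng.standard_normal((m, n_ens))
    return X + K @ (Y - H @ X)

# Toy usage: 30 voltage state variables, only the first 3 ("surface") observed.
X = rng.standard_normal((30, 20))                   # 20 ensemble members
Xa = enkf_analysis(X, rng.standard_normal(3), np.array([0, 1, 2]), 0.1)
```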
- Award ID(s): 1762803
- PAR ID: 10273172
- Date Published:
- Journal Name: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
- Volume: 378
- Issue: 2173
- ISSN: 1364-503X
- Page Range / eLocation ID: 20190388
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Fu, Feng (Ed.) Undetected infections fuel the dissemination of many infectious agents. However, identification of unobserved infectious individuals remains challenging due to limited observations of infections and imperfect knowledge of key transmission parameters. Here, we use an ensemble Bayesian inference method to infer unobserved infections using partial observations. The ensemble inference method can represent uncertainty in model parameters and update model states using all ensemble members collectively. We perform extensive experiments in both model-generated and real-world networks in which individuals have differential but unknown transmission rates. The ensemble method outperforms several alternative approaches for a variety of network structures and observation rates, even when the model is mis-specified. Additionally, the computational complexity of this algorithm scales almost linearly with both the number of nodes in the network and the number of observations, exhibiting the potential to apply to large-scale networks. The inference method may support decision-making under uncertainty and can be adapted for use with other dynamical models on networks.
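A schematic of the state-augmentation idea underlying this kind of ensemble inference is sketched below: unobserved node infection states and unknown per-node transmission rates are stacked into a single ensemble, so an EnKF-style update from partial observations adjusts both at once through their cross-covariances. This is a generic Gaussian-update sketch, not the specific algorithm used in the study.

```python
import numpy as np

rng = np.random.default_rng(2)

def augmented_update(Z, obs_rows, y, r):
    """EnKF-style update of an augmented ensemble (sketch).
    Z: (n_vars, n_ens) stacking node infection states AND per-node
    transmission rates; obs_rows: indices of the few observed variables;
    y: their observed values; r: observation error variance (assumed)."""
    n_ens = Z.shape[1]
    A = Z - Z.mean(axis=1, keepdims=True)          # ensemble anomalies
    HA = A[obs_rows]                               # anomalies of observed variables
    P_yy = HA @ HA.T / (n_ens - 1) + r * np.eye(len(obs_rows))
    K = (A @ HA.T / (n_ens - 1)) @ np.linalg.inv(P_yy)
    Y = y[:, None] + np.sqrt(r) * rng.standard_normal((len(obs_rows), n_ens))
    # Cross-covariances in K propagate information from observed nodes to
    # unobserved infections and to the transmission-rate parameters.
    return Z + K @ (Y - Z[obs_rows])
```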
This work demonstrates the efficiency of using iterative ensemble smoothers to estimate the parameters of an SEIR model. We have extended a standard SEIR model with age classes and compartments for sick, hospitalized, and dead individuals. The data conditioned on are the daily numbers of accumulated deaths and of hospitalized patients; it is also possible to condition the model on the number of cases obtained from testing. We start from a wide prior distribution for the model parameters; the ensemble conditioning then leads to a posterior ensemble of estimated parameters yielding model predictions in close agreement with the observations. The updated ensemble of model simulations has predictive capability and includes uncertainty estimates. In particular, we estimate the effective reproductive number as a function of time, and we can assess the impact of different intervention measures. By starting from the updated set of model parameters, we can make accurate short-term predictions of the epidemic development, assuming knowledge of the future effective reproductive number. The model system also allows for the computation of long-term scenarios of the epidemic under different assumptions. We have applied the model system to data sets from several countries: the four European countries Norway, England, The Netherlands, and France; the province of Quebec in Canada; the South American countries Argentina and Brazil; and the four US states Alabama, North Carolina, California, and New York. These countries and states all have vastly different developments of the epidemic, and we could accurately model the SARS-CoV-2 outbreak in all of them. We recognize that more complex models, e.g., with regional compartments, may be desirable, and we suggest that the approach used here should also be applicable to these models.
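For orientation, here is a compact sketch of an iterative ensemble smoother of the ES-MDA type with a uniform inflation schedule. The forward model used in the toy usage is a hypothetical one-parameter decay (the study integrates an age-structured SEIR model instead), and the scalar observation error variance is an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)

def esmda(theta, forward, y, r, n_iter=4):
    """Ensemble smoother with multiple data assimilation (ES-MDA) sketch.
    theta: (n_par, n_ens) prior parameter ensemble; forward(theta_j) returns
    simulated observations (n_obs,); y: data vector; r: obs error variance."""
    alphas = [n_iter] * n_iter              # uniform inflation, sum(1/alpha) = 1
    for alpha in alphas:
        n_ens = theta.shape[1]
        G = np.column_stack([forward(theta[:, j]) for j in range(n_ens)])
        A = theta - theta.mean(axis=1, keepdims=True)
        D = G - G.mean(axis=1, keepdims=True)
        C_dd = D @ D.T / (n_ens - 1) + alpha * r * np.eye(len(y))
        K = (A @ D.T / (n_ens - 1)) @ np.linalg.inv(C_dd)
        Y = y[:, None] + np.sqrt(alpha * r) * rng.standard_normal((len(y), n_ens))
        theta = theta + K @ (Y - G)         # update parameters, then re-run model
    return theta

# Toy usage: recover a decay rate from noisy observations of exp(-k t).
t = np.linspace(0.0, 5.0, 20)
forward = lambda th: np.exp(-th[0] * t)
y = forward([0.7]) + 0.05 * rng.standard_normal(t.size)
theta0 = rng.uniform(0.1, 2.0, (1, 50))     # wide prior, 50 members
print("posterior mean k:", esmda(theta0, forward, y, 0.05**2).mean())
```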
Probabilistic hazard assessments for studying overland pyroclastic flows or atmospheric ash clouds under the short timelines of an evolving crisis require using the best available science, unhampered by complicated and slow manual workflows. Although deterministic mathematical models are available, in most cases their parameters and initial conditions are known only within a prescribed range of uncertainty. For the construction of probabilistic hazard assessments, accurate outputs and propagation of the inherent input uncertainty to quantities of interest are needed to estimate the necessary probabilities based on numerous runs of the underlying deterministic model. Characterizing the uncertainty in system states due to parametric and input uncertainty simultaneously requires ensemble-based methods that explore the full parameter and input spaces. Complex tasks, such as running thousands of instances of a deterministic model with parameter and input uncertainty, require a high-performance computing infrastructure and skilled personnel that may not be readily available to the policy makers responsible for making informed risk-mitigation decisions. For efficiency, the programming tasks required for executing ensemble simulations need to run in parallel, leading to the twin computational challenges of managing large amounts of data and performing CPU-intensive processing. The resulting flow of work requires complex sequences of tasks, interactions, and exchanges of data, so automatic management of these workflows is essential. Here we discuss a computing infrastructure, methodology, and tools that enable scientists and other members of the volcanology research community to develop workflows for the construction of probabilistic hazard maps using remotely accessed computing through a web portal.
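The core computational pattern, sampling uncertain inputs, running many independent deterministic simulations in parallel, and reducing the results to exceedance probabilities, can be sketched as follows. The "flow model", parameter ranges, and threshold here are purely illustrative stand-ins for the real HPC workflow.

```python
import numpy as np
from multiprocessing import Pool

rng = np.random.default_rng(4)

def flow_model(params):
    """Stand-in for a deterministic flow/ash model: returns a hazard field
    (e.g., deposit thickness) along a transect. Purely illustrative."""
    volume, friction = params
    x = np.linspace(0.0, 10.0, 50)
    return volume * np.exp(-friction * x)

def hazard_map(n_runs=1000, threshold=0.1):
    # Sample inputs from their prescribed uncertainty ranges.
    samples = np.column_stack([
        rng.uniform(0.5, 2.0, n_runs),    # eruptive volume (scaled, hypothetical)
        rng.uniform(0.2, 1.0, n_runs),    # friction-like parameter (hypothetical)
    ])
    with Pool() as pool:                  # embarrassingly parallel ensemble
        fields = pool.map(flow_model, map(tuple, samples))
    # Probability that each map point exceeds the hazard threshold.
    return (np.stack(fields) > threshold).mean(axis=0)

if __name__ == "__main__":
    print(hazard_map()[:5])
```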
Abstract Climate models are generally calibrated manually by comparing selected climate statistics, such as the global top-of-atmosphere energy balance, to observations. The manual tuning targets only a limited subset of observational data and parameters. Bayesian calibration can estimate climate model parameters and their uncertainty using a larger fraction of the available data while automatically exploring the parameter space more broadly. In Bayesian learning, it is natural to exploit the seasonal cycle, which has large amplitude compared with anthropogenic climate change in many climate statistics. In this study, we develop methods for the calibration and uncertainty quantification (UQ) of model parameters that exploit the seasonal cycle, and we demonstrate a proof of concept with an idealized general circulation model (GCM). UQ is performed using the calibrate-emulate-sample approach, which combines stochastic optimization and machine-learning emulation to speed up Bayesian learning. The methods are demonstrated in a perfect-model setting through the calibration and UQ of a convective parameterization in an idealized GCM with a seasonal cycle. Calibration and UQ based on seasonally averaged climate statistics, compared to annually averaged statistics, reduce the calibration error by up to an order of magnitude and narrow the spread of the non-Gaussian posterior distributions by factors between two and five, depending on the variables used for UQ. The reduction in the spread of the parameter posterior distribution leads to a reduction in the uncertainty of climate model predictions.
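A heavily simplified, one-dimensional sketch of the calibrate-emulate-sample pattern follows. The expensive GCM is replaced by a toy function, the calibration stage is reduced to a space-filling design (the study uses ensemble-Kalman-based stochastic optimization), and the Gaussian-process emulator is replaced by a kernel-weighted average; the point is only the structure, in which the MCMC sampling runs on the cheap emulator rather than on the model itself.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy "GCM": maps one parameter to a seasonal-mean statistic (illustrative).
def gcm_statistic(theta):
    return np.sin(theta) + 0.1 * theta**2

y_obs, noise_var = gcm_statistic(1.3) + 0.01, 0.05**2   # synthetic target

# 1) Calibrate: evaluate the expensive model on a small design of points
#    (a stand-in for ensemble Kalman inversion).
thetas = rng.uniform(0.0, 3.0, 30)
evals = np.array([gcm_statistic(t) for t in thetas])

# 2) Emulate: cheap kernel-weighted surrogate fit to the (theta, statistic) pairs.
def emulator(t, scale=0.5):
    w = np.exp(-0.5 * ((t - thetas) / scale) ** 2)
    return np.sum(w * evals) / np.sum(w)

# 3) Sample: random-walk Metropolis on the emulator, not the expensive model.
def log_post(t):                       # flat prior assumed for the sketch
    return -0.5 * (y_obs - emulator(t)) ** 2 / noise_var

chain, t = [], 1.5
for _ in range(5000):
    prop = t + 0.2 * rng.standard_normal()
    if np.log(rng.uniform()) < log_post(prop) - log_post(t):
        t = prop
    chain.append(t)
print("posterior mean:", np.mean(chain[1000:]))
```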