Title: DATeS: a highly extensible data assimilation testing suite v1.0
Abstract. A flexible and highly extensible data assimilation testing suite, named DATeS, is described in this paper. DATeS aims to offer a unified testing environment that allows researchers to compare different data assimilation methodologies and understand their performance in various settings. The core of DATeS is implemented in Python and takes advantage of its object-oriented capabilities. The main components of the package (the numerical models, the data assimilation algorithms, the linear algebra solvers, and the time discretization routines) are independent of each other, which offers great flexibility to configure data assimilation applications. DATeS can interface easily with large third-party numerical models written in Fortran or in C, and with a plethora of external solvers.
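To make the modular design concrete, the sketch below composes an independent model class, filter class, and experiment driver into a simple twin experiment in Python with NumPy. It is an illustrative sketch only: the class names, interfaces, and parameter values are assumptions made for this example and are not DATeS's actual API.

    # Hypothetical sketch of the modular design the abstract describes: the model,
    # the assimilation algorithm, and the experiment driver are independent pieces.
    # All names and interfaces here are illustrative, NOT DATeS's actual API.
    import numpy as np

    class Lorenz96Model:
        """Toy forecast model standing in for a numerical-model component."""
        def __init__(self, n=40, forcing=8.0, dt=0.01):
            self.n, self.forcing, self.dt = n, forcing, dt

        def step(self, x):
            # dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F, advanced by forward Euler
            dxdt = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + self.forcing
            return x + self.dt * dxdt

    class StochasticEnKF:
        """Perturbed-observation EnKF analysis; any filter with this interface fits."""
        def __init__(self, H, R, rng):
            self.H, self.R, self.rng = H, R, rng

        def analysis(self, X, y):
            Y = X @ self.H.T                                   # ensemble in observation space
            Xa, Ya = X - X.mean(0), Y - Y.mean(0)
            m = X.shape[0]
            K = (Xa.T @ Ya / (m - 1)) @ np.linalg.inv(Ya.T @ Ya / (m - 1) + self.R)
            yp = y + self.rng.multivariate_normal(np.zeros(len(y)), self.R, size=m)
            return X + (yp - Y) @ K.T                          # updated ensemble

    # Experiment driver: swapping the model or the filter leaves this loop untouched.
    rng = np.random.default_rng(0)
    model = Lorenz96Model()
    H = np.eye(40)[::2]                                        # observe every other variable
    filt = StochasticEnKF(H, R=0.1 * np.eye(20), rng=rng)

    truth = rng.standard_normal(40)
    ens = truth + rng.standard_normal((25, 40))
    for _ in range(20):                                        # forecast/analysis cycles
        truth = model.step(truth)
        ens = np.array([model.step(e) for e in ens])
        obs = H @ truth + rng.multivariate_normal(np.zeros(20), 0.1 * np.eye(20))
        ens = filt.analysis(ens, obs)

Because the filter only sees the model through step() and the observation operator, either component can be replaced (a different model, another assimilation scheme, or an alternative linear algebra backend) without touching the driver, which is the kind of flexibility the abstract emphasizes.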
Award ID(s):
1709727
PAR ID:
10133860
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Geoscientific Model Development
Volume:
12
Issue:
2
ISSN:
1991-9603
Page Range / eLocation ID:
629 to 649
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Ranzato, M.; Beygelzimer, A.; Dauphin, Y.; Liang, P.S.; Vaughan, J. Wortman (Ed.)
    The accuracy of simulation-based forecasting in chaotic systems is heavily dependent on high-quality estimates of the system state at the beginning of the forecast. Data assimilation methods are used to infer these initial conditions by systematically combining noisy, incomplete observations and numerical models of system dynamics to produce highly effective estimation schemes. We introduce a self-supervised framework, which we call amortized assimilation, for learning to assimilate in dynamical systems. Amortized assimilation combines deep learning-based denoising with differentiable simulation, using independent neural networks to assimilate specific observation types while connecting the gradient flow between these sub-tasks with differentiable simulation and shared recurrent memory. This hybrid architecture admits a self-supervised training objective that is minimized by an unbiased estimator of the true system state even when only noisy training data are available. Numerical experiments across several chaotic benchmark systems highlight the improved effectiveness of our approach compared to widely used data assimilation methods.
  2. In this study, we conduct parameter estimation analysis on a data assimilation algorithm for two turbulence models: the simplified Bardina model and the Navier–Stokes-α model. Rigorous estimates are presented for the convergence of continuous data assimilation methods when the parameters of the turbulence models are not known a priori. Our approach involves creating an approximate solution for the turbulence models by employing an interpolant operator based on the observational data of the systems. The estimation depends on the parameter α in the models. Additionally, numerical simulations are presented to validate our theoretical results.
    (A minimal sketch of the continuous data assimilation nudging idea appears after this list.)
  3. Efficient real-time solvers for forward and inverse problems are essential in engineering and science applications. Machine learning surrogate models have emerged as promising alternatives to traditional methods, offering substantially reduced computational time. Nevertheless, these models typically demand extensive training datasets to achieve robust generalization across diverse scenarios. While physics-based approaches can partially mitigate this data dependency and ensure physics-interpretable solutions, addressing scarce data regimes remains a challenge. Both purely data-driven and physics-based machine learning approaches demonstrate severe overfitting issues when trained with insufficient data. We propose a novel model-constrained Tikhonov autoencoder neural network framework, called TAEN, capable of learning both forward and inverse surrogate models using a single arbitrary observational sample. We develop comprehensive theoretical foundations, including forward and inverse inference error bounds, for the proposed approach in linear cases. For comparative analysis, we derive equivalent formulations for purely data-driven and model-constrained approach counterparts. At the heart of our approach is a data randomization strategy with theoretical justification, which functions as a generative mechanism for exploring the training data space, enabling effective training of both forward and inverse surrogate models even with a single observation, while regularizing the learning process. We validate our approach through extensive numerical experiments on two challenging inverse problems: 2D heat conductivity inversion and initial condition reconstruction for time-dependent 2D Navier–Stokes equations. Results demonstrate that TAEN achieves accuracy comparable to traditional Tikhonov solvers and numerical forward solvers for both inverse and forward problems, respectively, while delivering orders-of-magnitude computational speedups.
  4. Abstract. Obtaining a faithful probabilistic depiction of moist convection is complicated by unknown errors in subgrid-scale physical parameterization schemes, invalid assumptions made by data assimilation (DA) techniques, and high system dimensionality. As an initial step toward untangling sources of uncertainty in convective weather regimes, we evaluate a novel Bayesian data assimilation methodology based on particle filtering within a WRF ensemble analysis and forecasting system. Unlike most geophysical DA methods, the particle filter (PF) represents prior and posterior error distributions nonparametrically rather than assuming a Gaussian distribution, and it can accept any type of likelihood function. This approach is known to reduce bias introduced by Gaussian approximations in low-dimensional and idealized contexts. The form of PF used in this research adopts a dimension-reduction strategy, making it affordable for typical weather applications. The present study examines posterior ensemble members and forecasts for select severe weather events between 2019 and 2020, comparing results from the PF with those from an ensemble Kalman filter (EnKF). We find that assimilating with a PF produces posterior quantities for microphysical variables that are more consistent with model climatology than comparable quantities from an EnKF, which we attribute to a reduction in DA bias. These differences are significant enough to impact the dynamic evolution of convective systems via cold pool strength and propagation, with impacts on forecast verification scores depending on the particular microphysics scheme. Our findings have broad implications for future approaches to the selection of physical parameterization schemes and parameter estimation within preexisting data assimilation frameworks.
    Significance Statement. The accurate prediction of severe storms using numerical weather models depends on effective parameterization schemes for small-scale processes and the assimilation of incomplete observational data in a manner that faithfully represents the probabilistic state of the atmosphere. Current-generation methods for data assimilation typically assume a standard form for the error distributions of relevant quantities, which can introduce bias that not only hinders numerical prediction but can also confound the characterization of errors from the model itself. The current study performs data assimilation using a novel method that does not make such assumptions and explores characteristics of the resulting model fields and forecasts that might make such a method useful for improving model parameterization schemes.
    (A minimal sketch of a particle filter analysis step appears after this list.)
  5. Abstract. Global solar photospheric magnetic maps play a critical role in solar and heliospheric physics research. Routine magnetograph measurements of the field occur only along the Sun–Earth line, leaving the far side of the Sun unobserved. Surface flux transport (SFT) models attempt to mitigate this by modeling the surface evolution of the field. While such models have long been established in the community (with several releasing public full-Sun maps), none are open source. The Open-source Flux Transport (OFT) model seeks to fill this gap by providing an open and user-extensible SFT model that also builds on the knowledge of previous models, with updated numerical and data acquisition/assimilation methods along with additional user-defined features. In this first of a series of papers on OFT, we introduce its computational core: the High-performance Flux Transport (HipFT) code (https://github.com/predsci/hipft). HipFT implements advection, diffusion, and data assimilation in a modular design that supports a variety of flow models and options. It can compute multiple realizations in a single run across model parameters to create ensembles of maps for uncertainty quantification, and it achieves high performance through multi-CPU and multi-GPU parallelism. HipFT is designed to enable users to write extensions easily, enhancing its flexibility and adaptability. We describe HipFT’s model features, validations of its numerical methods, performance of its parallel and GPU-accelerated code implementation, analysis/postprocessing options, and example use cases.
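The continuous data assimilation approach in item 2 nudges a model solution toward the interpolated observations of a reference solution. The sketch below illustrates that nudging idea on the Lorenz 63 system as a low-dimensional stand-in for the turbulence models; the time step, nudging coefficient mu, and the choice of observed component are assumptions made for this example, not values from the study.

    # Minimal continuous data assimilation (nudging) sketch, illustrating item 2.
    # Lorenz 63 stands in for the turbulence models; dt, mu, and the observed
    # component are illustrative assumptions.
    import numpy as np

    def lorenz63(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
        x, y, z = u
        return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

    def interpolant(u):
        """I_h(u): keep only the observed part of the state (here, the first component)."""
        return np.array([u[0], 0.0, 0.0])

    dt, mu = 0.001, 50.0                       # time step and nudging strength
    rng = np.random.default_rng(0)
    truth = np.array([1.0, 1.0, 1.0])          # reference trajectory generating observations
    assim = 5.0 * rng.standard_normal(3)       # assimilating solution, wrong initial state

    for _ in range(50_000):
        truth = truth + dt * lorenz63(truth)
        # assimilating equation: model dynamics plus mu * (I_h(truth) - I_h(assim))
        assim = assim + dt * (lorenz63(assim) + mu * (interpolant(truth) - interpolant(assim)))

    print("final state error:", np.linalg.norm(truth - assim))

With a sufficiently large nudging coefficient, the assimilating solution synchronizes with the reference trajectory even though only part of the state is observed; the parameter estimation analysis in the paper addresses the harder case where model parameters such as α are themselves uncertain.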
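Item 4 contrasts a particle filter, which represents the prior and posterior with weighted, resampled ensemble members rather than assuming Gaussian statistics, against an EnKF. Below is a minimal bootstrap particle filter analysis step as a generic illustration; it is not the localized, dimension-reduced filter evaluated in that study, and the observation operator and error variance are assumed for the example.

    # Minimal bootstrap particle filter analysis step (illustrating item 4).
    # Generic and low-dimensional; not the dimension-reduced PF used in the study.
    import numpy as np

    def pf_analysis(particles, y, H, obs_var, rng):
        """Weight particles with a Gaussian likelihood, then resample.

        particles: (n_particles, n_state) prior ensemble
        y:         observation vector
        """
        innov = y - particles @ H.T                                  # innovation for each particle
        loglik = -0.5 * np.sum(innov**2, axis=1) / obs_var
        w = np.exp(loglik - loglik.max())                            # stabilize before normalizing
        w /= w.sum()
        idx = rng.choice(len(particles), size=len(particles), p=w)   # multinomial resampling
        return particles[idx]

    rng = np.random.default_rng(0)
    H = np.array([[1.0, 0.0, 0.0]])                                  # observe the first state variable
    prior = rng.standard_normal((500, 3)) + np.array([2.0, 0.0, -1.0])
    obs = np.array([2.5])
    posterior = pf_analysis(prior, obs, H, obs_var=0.25, rng=rng)
    print("posterior mean:", posterior.mean(axis=0))

Unlike an EnKF analysis, nothing in this update assumes the prior is Gaussian: the likelihood can take any form, and the posterior is whatever the reweighted and resampled particles represent, which is the property the study exploits to reduce DA bias in microphysical variables.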