Title: DATeS: a highly extensible data assimilation testing suite v1.0
Abstract. A flexible and highly extensible data assimilation testing suite, named DATeS, is described in this paper. DATeS aims to offer a unified testing environment that allows researchers to compare different data assimilation methodologies and understand their performance in various settings. The core of DATeS is implemented in Python and takes advantage of its object-oriented capabilities. The main components of the package (the numerical models, the data assimilation algorithms, the linear algebra solvers, and the time discretization routines) are independent of each other, which offers great flexibility to configure data assimilation applications. DATeS can interface easily with large third-party numerical models written in Fortran or in C, and with a plethora of external solvers.
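To make the modular, object-oriented structure described above concrete, the following Python sketch wires a toy model object and a toy ensemble filter object together in a DATeS-like driver loop. The class names (Lorenz96Model, EnKFFilter) and their interfaces are illustrative assumptions for this sketch only, not the actual DATeS API.

# Illustrative sketch only: class names and constructor arguments are hypothetical
# stand-ins for the kind of independent model / assimilation components the
# abstract describes; they are not the actual DATeS API.
import numpy as np


class Lorenz96Model:
    """Toy forecast model standing in for a DATeS numerical-model object."""

    def __init__(self, n=40, forcing=8.0, dt=0.01):
        self.n, self.forcing, self.dt = n, forcing, dt

    def step(self, x):
        # One forward-Euler step of the Lorenz-96 dynamics.
        dxdt = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + self.forcing
        return x + self.dt * dxdt


class EnKFFilter:
    """Toy stochastic (perturbed-observation) EnKF standing in for a DATeS filter object."""

    def __init__(self, model, obs_std=1.0):
        self.model, self.obs_std = model, obs_std

    def forecast(self, ensemble):
        return np.array([self.model.step(member) for member in ensemble])

    def analysis(self, ensemble, y):
        # Identity observation operator; Kalman gain from the sample covariance.
        X = ensemble.T                                    # state x members
        A = X - X.mean(axis=1, keepdims=True)
        P = A @ A.T / (X.shape[1] - 1)
        K = P @ np.linalg.inv(P + self.obs_std**2 * np.eye(X.shape[0]))
        Y = y[:, None] + self.obs_std * np.random.randn(*X.shape)
        return (X + K @ (Y - X)).T


# Wire the independent components together, as a DATeS-style driver would.
model = Lorenz96Model()
enkf = EnKFFilter(model)
truth = np.random.randn(model.n)
ensemble = truth + 0.5 * np.random.randn(25, model.n)
for _ in range(100):
    truth = model.step(truth)
    ensemble = enkf.forecast(ensemble)
    y = truth + np.random.randn(model.n)                  # synthetic observations
    ensemble = enkf.analysis(ensemble, y)
print("analysis RMSE:", np.sqrt(np.mean((ensemble.mean(axis=0) - truth) ** 2)))

Because the model and the filter only meet through a small interface (step, forecast, analysis), either component can be swapped out without touching the other, which is the kind of flexibility the abstract attributes to DATeS.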
Award ID(s): 1709727
PAR ID: 10133860
Author(s) / Creator(s): ;
Date Published:
Journal Name: Geoscientific Model Development
Volume: 12
Issue: 2
ISSN: 1991-9603
Page Range / eLocation ID: 629 to 649
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Ranzato, M.; Beygelzimer, A.; Dauphin, Y.; Liang, P.S.; Vaughan, J. Wortman (Ed.)
    The accuracy of simulation-based forecasting in chaotic systems is heavily dependent on high-quality estimates of the system state at the beginning of the forecast. Data assimilation methods are used to infer these initial conditions by systematically combining noisy, incomplete observations and numerical models of system dynamics to produce highly effective estimation schemes. We introduce a self-supervised framework, which we call amortized assimilation, for learning to assimilate in dynamical systems. Amortized assimilation combines deep learning-based denoising with differentiable simulation, using independent neural networks to assimilate specific observation types while connecting the gradient flow between these sub-tasks with differentiable simulation and shared recurrent memory. This hybrid architecture admits a self-supervised training objective which is minimized by an unbiased estimator of the true system state even in the presence of only noisy training data. Numerical experiments across several chaotic benchmark systems highlight the improved effectiveness of our approach compared to widely used data assimilation methods.
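The following Python sketch illustrates the general idea behind the abstract above: a learned network assimilates noisy observations into the state, a differentiable simulator advances it, and the training loss compares the forecast only to later noisy observations. The toy dynamics, network shape, and noise levels are assumptions for illustration, not the authors' architecture.

# Schematic sketch of the amortized-assimilation idea; everything here
# (network, toy simulator, noise levels) is an illustrative assumption.
import torch
import torch.nn as nn


class DenoisingAssimilator(nn.Module):
    """Maps (prior state, noisy observation) to an analysis state."""

    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, prior, obs):
        return prior + self.net(torch.cat([prior, obs], dim=-1))


def simulate(x, dt=0.05):
    """Differentiable toy dynamics (stand-in for a real simulator)."""
    return x + dt * (torch.roll(x, -1, -1) - torch.roll(x, 1, -1))


dim, steps = 16, 8
model = DenoisingAssimilator(dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(200):
    truth = torch.randn(32, dim)                        # data-generating process
    state = truth + 0.5 * torch.randn_like(truth)       # noisy initial guess
    loss = 0.0
    for _ in range(steps):
        obs = truth + 0.1 * torch.randn_like(truth)     # only noisy obs are used
        state = model(state, obs)                       # assimilate
        state = simulate(state)                         # differentiable forecast
        truth = simulate(truth)
        next_obs = truth + 0.1 * torch.randn_like(truth)
        loss = loss + ((state - next_obs) ** 2).mean()  # self-supervised target
    opt.zero_grad()
    loss.backward()
    opt.step()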
  2. In this study, we conduct parameter estimation analysis on a data assimilation algorithm for two turbulence models: the simplified Bardina model and the Navier–Stokes-α model. Rigorous estimates are presented for the convergence of continuous data assimilation methods when the parameters of the turbulence models are not known a priori. Our approach involves creating an approximate solution for the turbulence models by employing an interpolant operator based on the observational data of the systems. The estimation depends on the parameter alpha in the models. Additionally, numerical simulations are presented to validate our theoretical results. 
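For context, continuous data assimilation of the kind described above is typically written as a nudging system. The generic LaTeX formulation below (with a generic interpolant operator I_h and nudging coefficient mu) is only a reminder of that setup; it is not the paper's specific Bardina or Navier-Stokes-alpha statements or its convergence estimates.

% Generic continuous-data-assimilation (nudging) form, illustrative only.
\begin{align*}
  \frac{\partial u}{\partial t} &= F(u;\,\alpha^{\ast})
      && \text{(reference dynamics, true but unknown parameter } \alpha^{\ast}\text{)} \\
  \frac{\partial v}{\partial t} &= F(v;\,\alpha)
      + \mu \bigl( I_h(u) - I_h(v) \bigr)
      && \text{(assimilated dynamics, guessed parameter } \alpha\text{)}
\end{align*}

Here I_h builds a coarse interpolant from the observed data and mu > 0 controls how strongly the approximate solution v is relaxed toward those observations; convergence of v to u when the model parameter is not known a priori is the kind of result the abstract refers to.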
  3. Efficient real-time solvers for forward and inverse problems are essential in engineering and science applications. Machine learning surrogate models have emerged as promising alternatives to traditional methods, offering substantially reduced computational time. Nevertheless, these models typically demand extensive training datasets to achieve robust generalization across diverse scenarios. While physics-based approaches can partially mitigate this data dependency and ensure physics-interpretable solutions, addressing scarce data regimes remains a challenge. Both purely data-driven and physics-based machine learning approaches demonstrate severe overfitting issues when trained with insufficient data. We propose a novel model-constrained Tikhonov autoencoder neural network framework, called TAEN, capable of learning both forward and inverse surrogate models using a single arbitrary observational sample. We develop comprehensive theoretical foundations including forward and inverse inference error bounds for the proposed approach for linear cases. For comparative analysis, we derive equivalent formulations for pure data-driven and model-constrained approach counterparts. At the heart of our approach is a data randomization strategy with theoretical justification, which functions as a generative mechanism for exploring the training data space, enabling effective training of both forward and inverse surrogate models even with a single observation, while regularizing the learning process. We validate our approach through extensive numerical experiments on two challenging inverse problems: 2D heat conductivity inversion and initial condition reconstruction for time-dependent 2D Navier–Stokes equations. Results demonstrate that TAEN achieves accuracy comparable to traditional Tikhonov solvers and numerical forward solvers for both inverse and forward problems, respectively, while delivering orders of magnitude computational speedups.
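As a reference point for the Tikhonov terminology in the abstract above, the classical regularized inverse problem that an inverse surrogate of this kind emulates can be written as follows. The notation (forward map G, observation y, prior mean u_0, regularization weight lambda) is generic and not taken from the paper.

% Classical Tikhonov-regularized inverse problem, generic notation.
\begin{equation*}
  u^{\ast}(y) \;=\; \arg\min_{u}\;
    \tfrac{1}{2}\,\bigl\| G(u) - y \bigr\|^{2}
    \;+\; \tfrac{\lambda}{2}\,\bigl\| u - u_{0} \bigr\|^{2}
\end{equation*}

A learned inverse surrogate approximates the map from y to u*(y), while a learned forward surrogate approximates G itself; the model-constrained training described in the abstract couples the two through the governing equations.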
  4. With growing transistor densities, analyzing temperature in 2D and 3D integrated circuits (ICs) is becoming more complicated and critical. Finite-element solvers give accurate results, but a single transient run can take hours or even days. Compact thermal models (CTMs) shorten the temperature simulation running time using a numerical solver based on the duality between thermal and electric properties. However, CTM solvers often still take hours for small-scale chips because of iterative numerical solvers. Recent work using machine learning (ML) models creates a fast and reliable framework for predicting temperature. However, current ML models demand large input samples and hours of GPU training to reach acceptable accuracy. To overcome the challenges stated, we design an ML framework that couples with CTMs to accelerate steady-state and transient thermal analysis without large data inputs. Our framework combines principal-component analysis (PCA) with closed-form linear regression to predict the on-chip temperature directly. The linear regression weights are solved analytically, so training for a grid size of 512 × 512 finishes in under a minute with only 15–20 CTM samples. Experimental results show that our framework can achieve more than 33× and 49.6× speedup for steady-state and transient simulation of a chip with a 245.95 mm² footprint, keeping the mean squared error below 0.1 °C².
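The PCA-plus-closed-form-regression recipe in the abstract above lends itself to a compact sketch: project the CTM temperature fields onto a few principal modes, then fit the modal coefficients with an analytically solved (ridge) regression. The feature choice, sample counts, and random stand-in data below are assumptions for illustration, not the paper's exact pipeline.

# Illustrative PCA + closed-form regression thermal surrogate; random arrays
# stand in for real CTM training data, and the feature design is assumed.
import numpy as np

n_samples, grid = 18, 512                        # e.g. 15-20 CTM training runs
n_pix = grid * grid

P = np.random.rand(n_samples, n_pix)             # flattened power maps (inputs)
T = np.random.rand(n_samples, n_pix)             # flattened CTM temperatures

# PCA via SVD on the centered temperature fields; keep a few leading modes.
T_mean = T.mean(axis=0)
U, S, Vt = np.linalg.svd(T - T_mean, full_matrices=False)
k = min(8, n_samples)
basis = Vt[:k]                                    # principal temperature modes
coeffs = (T - T_mean) @ basis.T                   # per-sample modal coefficients

# Closed-form (ridge) regression from reduced power features to coefficients:
# W = (X^T X + lam I)^(-1) X^T C, solved analytically, no iterative training.
X = np.hstack([P @ basis.T, np.ones((n_samples, 1))])   # reduced features + bias
lam = 1e-6
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ coeffs)

def predict_temperature(power_map):
    """Predict a full-grid temperature map from a flattened power map."""
    x = np.concatenate([power_map @ basis.T, [1.0]])
    return T_mean + (x @ W) @ basis

T_hat = predict_temperature(P[0])
print("train-sample MSE:", np.mean((T_hat - T[0]) ** 2))

Because the regression weights come from a single linear solve in the reduced space, training cost is dominated by one thin SVD and one small matrix inversion, which is consistent with the sub-minute training time the abstract reports.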
  5. Abstract. Obtaining a faithful probabilistic depiction of moist convection is complicated by unknown errors in subgrid-scale physical parameterization schemes, invalid assumptions made by data assimilation (DA) techniques, and high system dimensionality. As an initial step toward untangling sources of uncertainty in convective weather regimes, we evaluate a novel Bayesian data assimilation methodology based on particle filtering within a WRF ensemble analysis and forecasting system. Unlike most geophysical DA methods, the particle filter (PF) represents prior and posterior error distributions nonparametrically rather than assuming a Gaussian distribution and can accept any type of likelihood function. This approach is known to reduce bias introduced by Gaussian approximations in low-dimensional and idealized contexts. The form of PF used in this research adopts a dimension-reduction strategy, making it affordable for typical weather applications. The present study examines posterior ensemble members and forecasts for select severe weather events between 2019 and 2020, comparing results from the PF with those from an ensemble Kalman filter (EnKF). We find that assimilating with a PF produces posterior quantities for microphysical variables that are more consistent with model climatology than comparable quantities from an EnKF, which we attribute to a reduction in DA bias. These differences are significant enough to impact the dynamic evolution of convective systems via cold pool strength and propagation, with impacts to forecast verification scores depending on the particular microphysics scheme. Our findings have broad implications for future approaches to the selection of physical parameterization schemes and parameter estimation within preexisting data assimilation frameworks.
    Significance Statement. The accurate prediction of severe storms using numerical weather models depends on effective parameterization schemes for small-scale processes and the assimilation of incomplete observational data in a manner that faithfully represents the probabilistic state of the atmosphere. Current generation methods for data assimilation typically assume a standard form for the error distributions of relevant quantities, which can introduce bias that not only hinders numerical prediction, but that can also confound the characterization of errors from the model itself. The current study performs data assimilation using a novel method that does not make such assumptions and explores characteristics of resulting model fields and forecasts that might make such a method useful for improving model parameterization schemes.
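To illustrate the nonparametric weighting that distinguishes a particle filter from an EnKF in the abstract above, the following Python sketch performs one bootstrap particle-filter analysis step on a low-dimensional toy problem. The paper's localized, dimension-reduced PF for WRF is far more involved; this only shows the generic weight-and-resample idea, with an identity observation operator and made-up numbers.

# Minimal bootstrap particle-filter analysis step; toy setup, not the paper's PF.
import numpy as np

def particle_filter_update(particles, y, obs_std, rng):
    """Weight particles by observation likelihood, then resample."""
    # Any likelihood could replace this Gaussian one; no Gaussian prior assumed.
    log_w = -0.5 * np.sum((particles - y) ** 2, axis=1) / obs_std**2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

rng = np.random.default_rng(0)
truth = rng.normal(size=3)
particles = truth + rng.normal(scale=1.0, size=(500, 3))   # prior ensemble
y = truth + rng.normal(scale=0.5, size=3)                  # noisy observation
posterior = particle_filter_update(particles, y, obs_std=0.5, rng=rng)
print("posterior mean:", posterior.mean(axis=0), "truth:", truth)

Unlike the EnKF's linear Gaussian update, the posterior here is whatever the weighted, resampled particles say it is, which is the property the study exploits to avoid Gaussian-approximation bias.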