Search for: All records

Award ID contains: 1835860

Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

  1. Abstract We propose and unify classes of different models for information propagation over graphs. In a first class, propagation is modelled as a wave, which emanates from a set of known nodes at an initial time, to all other unknown nodes at later times with an ordering determined by the arrival time of the information wave front. A second class of models is based on the notion of a travel time along paths between nodes. The time of information propagation from an initial known set of nodes to a node is defined as the minimum of a generalised travel time over subsets of all admissible paths. A final class is given by imposing a local equation of an eikonal form at each unknown node, with boundary conditions at the known nodes. The solution value of the local equation at a node is coupled to those of neighbouring nodes with lower values. We provide precise formulations of the model classes and prove equivalences between them. Finally, we apply the front propagation models on graphs to semi-supervised learning via label propagation and information propagation on trust networks.
    Free, publicly-accessible full text available October 1, 2026
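The travel-time formulation in item 1 admits a simple algorithmic reading: with non-negative edge weights and path length as the (generalised) travel time, the arrival time of the front at each node is a shortest-path distance computable by Dijkstra's algorithm, and label propagation assigns each unknown node the label of the seed whose front reaches it first. A minimal sketch of that special case follows; the toy graph, weights, and labels are hypothetical, and the paper's generalised travel times and eikonal formulations go beyond this.

```python
import heapq

def propagate_labels(adj, seeds):
    """Front propagation on a weighted graph.

    adj:   dict mapping node -> list of (neighbor, edge_travel_time)
    seeds: dict mapping known node -> label
    Returns (arrival_time, label) for every reachable node.
    """
    arrival = {v: 0.0 for v in seeds}           # fronts start at the known set
    label = dict(seeds)
    heap = [(0.0, v) for v in seeds]
    heapq.heapify(heap)
    while heap:
        t, u = heapq.heappop(heap)
        if t > arrival.get(u, float("inf")):
            continue                             # stale heap entry
        for v, w in adj[u]:
            if t + w < arrival.get(v, float("inf")):
                arrival[v] = t + w               # an earlier front reaches v
                label[v] = label[u]              # v inherits that seed's label
                heapq.heappush(heap, (arrival[v], v))
    return arrival, label

# Hypothetical toy graph: two labeled seed nodes (0, 1), four unknown nodes.
adj = {
    0: [(2, 1.0)], 1: [(3, 1.0)],
    2: [(0, 1.0), (3, 2.0), (4, 1.0)],
    3: [(1, 1.0), (2, 2.0), (5, 1.0)],
    4: [(2, 1.0), (5, 3.0)],
    5: [(3, 1.0), (4, 3.0)],
}
print(propagate_labels(adj, {0: "A", 1: "B"}))
```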
  2. Abstract Cloud microphysics is a critical aspect of the Earth's climate system, which involves processes at the nano‐ and micrometer scales of droplets and ice particles. In climate modeling, cloud microphysics is commonly represented by bulk models, which contain simplified process rates that require calibration. This study presents a framework for calibrating warm‐rain bulk schemes using high‐fidelity super‐droplet simulations that provide a more accurate and physically based representation of cloud and precipitation processes. The calibration framework employs ensemble Kalman methods including Ensemble Kalman Inversion and Unscented Kalman Inversion to calibrate bulk microphysics schemes with probabilistic super‐droplet simulations. We demonstrate the framework's effectiveness by calibrating a single‐moment bulk scheme, resulting in a reduction of data‐model mismatch by more than 75% compared to the model with initial parameters. Thus, this study demonstrates a powerful tool for enhancing the accuracy of bulk microphysics schemes in atmospheric models and improving climate modeling. 
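For readers unfamiliar with ensemble Kalman inversion, the basic iteration underlying this kind of calibration is short: propagate an ensemble of candidate parameters through the forward model, then nudge the ensemble toward the data using sample covariances. The sketch below uses a made-up power-law "process rate" as a stand-in for a bulk-scheme statistic; it is not the paper's microphysics setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def eki_step(theta, G, y, Gamma):
    """One ensemble Kalman inversion update with perturbed observations.

    theta: (J, p) parameter ensemble; G: forward map theta -> statistics (d,)
    y: reference statistics (d,); Gamma: (d, d) observation-noise covariance
    """
    g = np.array([G(t) for t in theta])            # (J, d) forward evaluations
    dtheta = theta - theta.mean(axis=0)            # ensemble anomalies
    dg = g - g.mean(axis=0)
    C_tg = dtheta.T @ dg / (len(theta) - 1)        # parameter-output cross-cov
    C_gg = dg.T @ dg / (len(theta) - 1)            # output covariance
    eta = rng.multivariate_normal(np.zeros(len(y)), Gamma, size=len(theta))
    return theta + (y + eta - g) @ np.linalg.solve(C_gg + Gamma, C_tg.T)

# Hypothetical stand-in for a bulk process rate: rate = a * q**b on a grid of q.
q = np.linspace(0.5, 2.0, 8)
G = lambda t: t[0] * q ** t[1]
Gamma = 0.01 * np.eye(len(q))
y = G(np.array([2.0, 1.5])) + rng.multivariate_normal(np.zeros(len(q)), Gamma)

theta = rng.normal([1.0, 1.0], 0.5, size=(50, 2))  # prior ensemble for (a, b)
for _ in range(10):
    theta = eki_step(theta, G, y, Gamma)
print(theta.mean(axis=0))                          # ≈ (2.0, 1.5)
```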
  3. Abstract This work integrates machine learning into an atmospheric parameterization to target uncertain mixing processes while maintaining interpretable, predictive, and well‐established physical equations. We adopt an eddy‐diffusivity mass‐flux (EDMF) parameterization for the unified modeling of various convective and turbulent regimes. To avoid drift and instability that plague offline‐trained machine learning parameterizations that are subsequently coupled with climate models, we frame learning as an inverse problem: Data‐driven models are embedded within the EDMF parameterization and trained online in a one‐dimensional vertical global climate model (GCM) column. Training is performed against output from large‐eddy simulations (LES) forced with GCM‐simulated large‐scale conditions in the Pacific. Rather than optimizing subgrid‐scale tendencies, our framework directly targets climate variables of interest, such as the vertical profiles of entropy and liquid water path. Specifically, we use ensemble Kalman inversion to simultaneously calibrate both the EDMF parameters and the parameters governing data‐driven lateral mixing rates. The calibrated parameterization outperforms existing EDMF schemes, particularly in tropical and subtropical locations of the present climate, and maintains high fidelity in simulating shallow cumulus and stratocumulus regimes under increased sea surface temperatures from AMIP4K experiments. The results showcase the advantage of physically constraining data‐driven models and directly targeting relevant variables through online learning to build robust and stable machine learning parameterizations. 
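Framing learning as an inverse problem, as item 3 describes, can be stated compactly. In schematic notation (assumed here, not quoted from the paper): let y collect the target climate statistics from LES (e.g., entropy profiles and liquid water path), let θ stack the EDMF parameters together with the parameters of the data-driven lateral mixing rates, and let 𝒢 be the one-dimensional GCM column run to statistical output. Then

```latex
y = \mathcal{G}(\theta) + \eta, \qquad \eta \sim \mathcal{N}(0,\Gamma), \qquad
\theta^\star \in \arg\min_{\theta} \tfrac{1}{2}
\bigl\lVert \Gamma^{-1/2}\bigl(y - \mathcal{G}(\theta)\bigr)\bigr\rVert^{2},
```

and ensemble Kalman inversion solves this derivative-free, so no gradients need to flow through the GCM column or the embedded data-driven model.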
  4. Abstract We present a method to downscale idealized geophysical fluid simulations using generative models based on diffusion maps. By analyzing the Fourier spectra of fields drawn from different data distributions, we show how a diffusion bridge can be used as a transformation between a low-resolution and a high-resolution dataset, allowing for the generation of new high-resolution samples given specific low-resolution features. The ability to generate new samples allows for the computation of any statistic of interest, without any additional calibration or training. Our unsupervised setup is also designed to downscale fields without access to paired training data; this flexibility allows for the combination of multiple source and target domains without additional training. We demonstrate that the method enhances resolution and corrects context-dependent biases in geophysical fluid simulations, including in extreme events. We anticipate that the same method can be used to downscale the output of climate simulations, including temperature and precipitation fields, without needing to train a new model for each application, providing significant computational cost savings.
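A heavily simplified sketch of the bridge idea: noise a low-resolution sample forward to an intermediate time, then denoise it backward using a score model trained on high-resolution data, so shared large-scale features survive while small scales are regenerated from the high-resolution distribution. Everything below (the variance-preserving schedule with β(t) = 1, the score function `score_hi`, the bridge time) is an assumption for illustration, not the paper's configuration.

```python
import numpy as np

def diffusion_bridge(x_lo, score_hi, t_star=0.6, n_steps=300, rng=None):
    """Schematic unpaired translation via a diffusion bridge (VP-SDE style).

    x_lo:     sample drawn from the low-resolution distribution
    score_hi: score model s(x, t) trained on high-resolution data (assumed)
    t_star:   bridge time; large enough to noise away small scales,
              small enough that shared large-scale features survive
    """
    rng = rng or np.random.default_rng(0)
    # Forward: noise the low-res field up to t_star (VP kernel, beta(t) = 1).
    alpha = np.exp(-0.5 * t_star)
    x = alpha * x_lo + np.sqrt(1 - alpha**2) * rng.standard_normal(x_lo.shape)
    # Reverse: denoise from t_star to 0 with the HIGH-res score (Euler-Maruyama).
    dt = t_star / n_steps
    for _ in range(n_steps):
        x = x + (0.5 * x + score_hi(x, None)) * dt \
              + np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# Tiny check with an analytically known score: if the high-res distribution is
# N(0, 1), the VP-perturbed marginal stays N(0, 1) and the score is s(x) = -x.
x_lo = np.full(10000, 2.0)                  # degenerate "low-res" samples
out = diffusion_bridge(x_lo, lambda x, t: -x)
print(out.mean(), out.std())                # pulled toward N(0, 1) statistics
```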
  5. Abstract Data required to calibrate uncertain general circulation model (GCM) parameterizations are often only available in limited regions or time periods, for example, observational data from field campaigns, or data generated in local high‐resolution simulations. This raises the question of where and when to acquire additional data to be maximally informative about parameterizations in a GCM. Here we construct a new ensemble‐based parallel algorithm to automatically target data acquisition to regions and times that maximize the uncertainty reduction, or information gain, about GCM parameters. The algorithm uses a Bayesian framework that exploits a quantified distribution of GCM parameters as a measure of uncertainty. This distribution is informed by time‐averaged climate statistics restricted to local regions and times. The algorithm is embedded in the recently developed calibrate‐emulate‐sample framework, which performs efficient model calibration and uncertainty quantification with only a small number of model evaluations, compared with the far larger number of evaluations typically needed for traditional approaches to Bayesian calibration. We demonstrate the algorithm with an idealized GCM, with which we generate surrogates of local data. In this perfect‐model setting, we calibrate parameters and quantify uncertainties in a quasi‐equilibrium convection scheme in the GCM. We consider targeted data that are (a) localized in space for statistically stationary simulations, and (b) localized in space and time for seasonally varying simulations. In these proof‐of‐concept applications, the calculated information gain reflects the reduction in parametric uncertainty obtained from Bayesian inference when harnessing a targeted sample of data. The largest information gain typically, but not always, results from regions near the intertropical convergence zone.
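The targeting criterion can be illustrated in the linear-Gaussian limit, where the expected information gain of a candidate region reduces to a log-determinant ratio of prior and posterior parameter covariances. The sensitivities, covariances, and region names below are hypothetical; the paper's algorithm works with the full calibrate-emulate-sample machinery rather than this closed form.

```python
import numpy as np

def information_gain(Sigma_prior, H, Gamma):
    """Expected information gain (nats) for a linear-Gaussian surrogate.

    Sigma_prior: (p, p) prior covariance of GCM parameters
    H:           (d, p) sensitivity of region/time-restricted statistics
    Gamma:       (d, d) noise covariance of those statistics
    For Gaussians, the expected KL(posterior || prior) reduces to half the
    log-ratio of determinants and does not depend on the realized data.
    """
    Sigma_post = np.linalg.inv(
        np.linalg.inv(Sigma_prior) + H.T @ np.linalg.solve(Gamma, H)
    )
    return 0.5 * (np.linalg.slogdet(Sigma_prior)[1]
                  - np.linalg.slogdet(Sigma_post)[1])

# Hypothetical: pick whichever of two candidate regions shrinks uncertainty more.
Sigma_prior = np.eye(2)
H_tropics = np.array([[1.0, 0.2], [0.8, 0.1]])   # strongly parameter-sensitive
H_midlat  = np.array([[0.1, 0.3], [0.2, 0.4]])
Gamma = 0.1 * np.eye(2)
gains = {name: information_gain(Sigma_prior, H, Gamma)
         for name, H in [("tropics", H_tropics), ("midlatitudes", H_midlat)]}
print(max(gains, key=gains.get), gains)
```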
  6. Abstract Dynamical cores used to study the circulation of the atmosphere employ various numerical methods, including finite‐volume, spectral-element, global spectral, and hybrid methods. In this work, we explore the use of Flux‐Differencing Discontinuous Galerkin (FDDG) methods to simulate a fully compressible dry atmosphere at various resolutions. We show that the method offers a judicious compromise between high‐order accuracy and stability for large‐eddy simulations and simulations of the atmospheric general circulation. In particular, filters, divergence damping, diffusion, hyperdiffusion, or sponge layers are not required to ensure stability; only the numerical dissipation naturally afforded by FDDG is necessary. We apply the method to the simulation of dry convection in an atmospheric boundary layer and in a global atmospheric dynamical core in the standard benchmark of Held and Suarez (1994, https://doi.org/10.1175/1520-0477(1994)075〈1825:apftio〉2.0.co;2).
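The key ingredient of flux-differencing methods is a two-point numerical flux with built-in conservation or entropy properties, replacing the pointwise flux of the equation being solved. A first-order, finite-volume-flavored sketch for Burgers' equation using Tadmor's entropy-conservative two-point flux shows the structure; the actual FDDG scheme applies such fluxes within high-order discontinuous Galerkin elements and obtains its stabilizing dissipation at element interfaces, which this toy omits.

```python
import numpy as np

def f_ec(uL, uR):
    """Entropy-conservative two-point flux for Burgers' equation (Tadmor),
    replacing the pointwise flux u**2 / 2."""
    return (uL**2 + uL * uR + uR**2) / 6.0

def rhs(u, dx):
    """Periodic first-order flux-differencing form; high-order schemes apply
    the two-point flux between all node pairs inside each element as well."""
    up, um = np.roll(u, -1), np.roll(u, 1)
    return -(f_ec(u, up) - f_ec(um, u)) / dx

# Demo: the quadratic entropy (energy) is conserved up to time-stepping error,
# with no filter or artificial diffusion, before the shock forms at t = 1.
n, dt = 256, 1e-3
dx = 2 * np.pi / n
u = np.sin(np.arange(n) * dx)
for _ in range(500):                       # integrate to t = 0.5
    k1 = rhs(u, dx)
    k2 = rhs(u + 0.5 * dt * k1, dx)
    u = u + dt * k2                        # midpoint Runge-Kutta step
print(np.sum(u**2) * dx)                   # ≈ pi, the initial energy
```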
  7. Abstract Parameters in climate models are usually calibrated manually, exploiting only small subsets of the available data. This precludes both optimal calibration and quantification of uncertainties. Traditional Bayesian calibration methods that allow uncertainty quantification are too expensive for climate models; they are also not robust in the presence of internal climate variability. For example, Markov chain Monte Carlo (MCMC) methods typically require a very large number of model runs and are sensitive to internal variability noise, rendering them infeasible for climate models. Here we demonstrate an approach to model calibration and uncertainty quantification that requires only a relatively small number of model runs and can accommodate internal climate variability. The approach consists of three stages: (a) a calibration stage uses variants of ensemble Kalman inversion to calibrate a model by minimizing mismatches between model and data statistics; (b) an emulation stage emulates the parameter‐to‐data map with Gaussian processes (GP), using the model runs in the calibration stage for training; (c) a sampling stage approximates the Bayesian posterior distributions by sampling the GP emulator with MCMC. We demonstrate the feasibility and computational efficiency of this calibrate‐emulate‐sample (CES) approach in a perfect‐model setting. Using an idealized general circulation model, we estimate parameters in a simple convection scheme from synthetic data generated with the model. The CES approach generates probability distributions of the parameters that are good approximations of the Bayesian posteriors, at a fraction of the computational cost usually required to obtain them. Sampling from this approximate posterior allows the generation of climate predictions with quantified parametric uncertainties.
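Since the three CES stages are the heart of item 7, a toy end-to-end sketch may help. Stage (a) is abbreviated to prior sampling of a one-parameter forward map (a proper run would use ensemble Kalman inversion, as sketched under item 2); stage (b) fits a Gaussian process with scikit-learn; stage (c) runs random-walk Metropolis on the cheap emulator instead of the expensive model. The forward map, priors, and step sizes are all invented for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)

# --- Stage (a), abbreviated: an ensemble of forward-model evaluations.
def G(theta):                        # hypothetical scalar "climate statistic"
    return np.sin(theta) + 0.1 * theta**2
theta_train = rng.uniform(-2, 2, size=(40, 1))
noise_var = 0.05**2
y_obs = G(1.2)                       # synthetic "truth" at theta = 1.2
g_train = G(theta_train).ravel() + rng.normal(0, 0.05, 40)

# --- Stage (b): emulate the parameter-to-data map with a GP.
gp = GaussianProcessRegressor(alpha=noise_var).fit(theta_train, g_train)

# --- Stage (c): random-walk Metropolis on the GP emulator, not the model.
def log_post(th):
    m, s = gp.predict(np.array([[th]]), return_std=True)
    var = s[0]**2 + noise_var        # emulator + observation uncertainty
    return -0.5 * ((y_obs - m[0])**2 / var + np.log(var)) - 0.5 * th**2 / 4.0

samples, th, lp = [], 0.0, log_post(0.0)
for _ in range(5000):
    prop = th + 0.3 * rng.standard_normal()
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        th, lp = prop, lp_prop
    samples.append(th)
print(np.mean(samples[1000:]), np.std(samples[1000:]))  # posterior ≈ 1.2 ± ...
```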
  8. Abstract Most machine learning applications in Earth system modeling currently rely on gradient‐based supervised learning. This imposes stringent constraints on the nature of the data used for training (typically, residual time tendencies are needed), and it complicates learning about the interactions between machine‐learned parameterizations and other components of an Earth system model. Approaching learning about process‐based parameterizations as an inverse problem resolves many of these issues, since it allows parameterizations to be trained with partial observations or statistics that directly relate to quantities of interest in long‐term climate projections. Here, we demonstrate the effectiveness of Kalman inversion methods in treating learning about parameterizations as an inverse problem. We consider two different algorithms: unscented and ensemble Kalman inversion. Both methods involve highly parallelizable forward model evaluations, converge exponentially fast, and do not require gradient computations. In addition, unscented Kalman inversion provides a measure of parameter uncertainty. We illustrate how training parameterizations can be posed as a regularized inverse problem and solved by ensemble Kalman methods through the calibration of an eddy‐diffusivity mass‐flux scheme for subgrid‐scale turbulence and convection, using data generated by large‐eddy simulations. We find the algorithms amenable to batching strategies, robust to noise and model failures, and efficient in the calibration of hybrid parameterizations that can include empirical closures and neural networks. 
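The phrase "regularized inverse problem" in item 8 has a concrete meaning worth spelling out, in assumed notation following the standard construction rather than quoting the paper: augment the observations with the prior mean, so that the ensemble Kalman update minimizes a Tikhonov-type objective,

```latex
\min_{\theta}\;
\tfrac{1}{2}\bigl\lVert \Gamma^{-1/2}\bigl(y - \mathcal{G}(\theta)\bigr)\bigr\rVert^{2}
\;+\;
\tfrac{1}{2}\bigl\lVert \Sigma_{0}^{-1/2}\,(\theta - m_{0})\bigr\rVert^{2},
```

obtained by applying the ensemble method to the augmented data z = (y, m₀) and augmented forward map F(θ) = (𝒢(θ), θ) with block-diagonal noise covariance diag(Γ, Σ₀). The prior term keeps the iteration well posed even when the data alone do not constrain every parameter.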
  9. Abstract Advances in high‐performance computing have enabled large‐eddy simulations (LES) of turbulence, convection, and clouds. However, their potential to improve parameterizations in global climate models (GCMs) is only beginning to be harnessed, with relatively few canonical LES available so far. The purpose of this paper is to begin creating a public LES library that expands the training data available for calibrating and evaluating GCM parameterizations. To do so, we use an experimental setup in which LES are driven by large‐scale forcings from GCMs, which in principle can be used at any location, any time of year, and in any climate state. We use this setup to create a library of LES of clouds across the tropics and subtropics, in the present and in a warmer climate, with a focus on the transition from stratocumulus to shallow cumulus over the East Pacific. The LES results are relatively insensitive to the choice of host GCM driving the LES. Driven with large‐scale forcing under global warming, the LES simulate a positive but weak shortwave cloud feedback, adding to the accumulating evidence that low clouds amplify global warming. 
  10. Abstract Climate models are generally calibrated manually by comparing selected climate statistics, such as the global top‐of‐atmosphere energy balance, to observations. The manual tuning only targets a limited subset of observational data and parameters. Bayesian calibration can estimate climate model parameters and their uncertainty using a larger fraction of the available data and automatically exploring the parameter space more broadly. In Bayesian learning, it is natural to exploit the seasonal cycle, which has large amplitude compared with anthropogenic climate change in many climate statistics. In this study, we develop methods for the calibration and uncertainty quantification (UQ) of model parameters exploiting the seasonal cycle, and we demonstrate a proof‐of‐concept with an idealized general circulation model (GCM). UQ is performed using the calibrate‐emulate‐sample approach, which combines stochastic optimization and machine learning emulation to speed up Bayesian learning. The methods are demonstrated in a perfect‐model setting through the calibration and UQ of a convective parameterization in an idealized GCM with a seasonal cycle. Calibration and UQ based on seasonally averaged climate statistics, compared to annually averaged, reduces the calibration error by up to an order of magnitude and narrows the spread of the non‐Gaussian posterior distributions by factors between two and five, depending on the variables used for UQ. The reduction in the spread of the parameter posterior distribution leads to a reduction in the uncertainty of climate model predictions. 
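The difference between annually and seasonally averaged statistics, which drives the error reduction reported in item 10, is mechanical and easy to picture: seasonal averaging keeps the large seasonal-cycle signal in the observation vector, whereas annual averaging removes it. A toy construction follows, with hypothetical data on a 360-day calendar.

```python
import numpy as np

rng = np.random.default_rng(2)
days = np.arange(360)
cycle = 10.0 * np.sin(2 * np.pi * days / 360)     # hypothetical seasonal cycle
daily = cycle + rng.standard_normal((10, 360))    # (years, days) of a statistic

annual = daily.mean(axis=1)                       # cycle averages out to ~0
seasonal = daily.reshape(10, 4, 90).mean(axis=2)  # four 90-day means keep it

y_annual = annual.mean(axis=0)                    # near 0: signal lost
y_seasonal = seasonal.mean(axis=0)                # ≈ [+6.4, +6.4, -6.4, -6.4]
print(y_annual, y_seasonal)
```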