Title: BayGEN: A Bayesian Space‐Time Stochastic Weather Generator
Abstract: We present a Bayesian hierarchical space‐time stochastic weather generator (BayGEN) to generate daily precipitation and minimum and maximum temperatures. BayGEN employs a hierarchical framework with data, process, and parameter layers. In the data layer, precipitation occurrence at each site is modeled by probit regression driven by a spatially distributed latent Gaussian process; precipitation amounts are modeled as gamma random variables; and minimum and maximum temperatures are modeled as realizations from Gaussian processes. The latent Gaussian process that drives precipitation occurrence is modeled in the process layer. In the parameter layer, the parameters of the data and process layers are modeled as spatially distributed Gaussian processes, which enables the simulation of daily weather at arbitrary (unobserved) locations or on a regular grid. All model parameters are given weakly informative prior distributions. The No‐U‐Turn sampler, an adaptive form of Hamiltonian Monte Carlo, is used to obtain posterior samples of each parameter. These posterior samples propagate parameter uncertainty to the weather simulations, an important feature that distinguishes BayGEN from traditional weather generators. We demonstrate the utility of BayGEN by generating daily weather in a basin of the Argentine Pampas. Furthermore, we evaluate the implications for crop yield by driving a crop simulation model with weather simulations from BayGEN and from an equivalent non‐Bayesian weather generator.
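The data layer described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the exponential covariance kernel, the function names `exp_cov` and `simulate_day`, and every parameter value below are assumptions chosen only for illustration. The sketch draws one day of weather at a set of sites: a latent Gaussian process is thresholded at zero (the latent-variable form of probit regression), wet sites receive gamma-distributed amounts, and temperatures are drawn from Gaussian processes.

```python
import numpy as np

def exp_cov(coords, sill, range_km):
    """Exponential covariance between sites (an assumed kernel, not from the paper)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-d / range_km)

def simulate_day(coords, rng, occ_mean=-0.3, shape=0.8, scale=6.0,
                 tmin_mean=12.0, tmax_mean=24.0):
    """Draw one day of occurrence, amounts, and temperatures at each site.

    All default parameter values are hypothetical placeholders.
    """
    n = len(coords)
    jitter = 1e-8 * np.eye(n)  # small diagonal term for numerical stability

    # Latent Gaussian process with unit marginal variance; a site is wet when
    # its latent value is positive, so P(wet) = Phi(occ_mean) at every site.
    latent = rng.multivariate_normal(np.full(n, occ_mean),
                                     exp_cov(coords, 1.0, 150.0) + jitter)
    wet = latent > 0

    # Gamma-distributed precipitation amounts on wet sites, zero elsewhere.
    amounts = np.where(wet, rng.gamma(shape, scale, size=n), 0.0)

    # Temperatures as Gaussian process realizations. This sketch does not
    # enforce tmin <= tmax or any dependence on occurrence.
    t_cov = exp_cov(coords, 4.0, 300.0) + jitter
    tmin = rng.multivariate_normal(np.full(n, tmin_mean), t_cov)
    tmax = rng.multivariate_normal(np.full(n, tmax_mean), t_cov)
    return wet, amounts, tmin, tmax

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 300.0, size=(5, 2))  # five hypothetical sites, in km
wet, amounts, tmin, tmax = simulate_day(coords, rng)
```

In a Bayesian generator of this kind, the fixed parameter values above would be replaced by one posterior draw per simulation, so repeating the call across posterior draws carries parameter uncertainty into the simulated weather.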
Award ID(s): 1811294
PAR ID: 10453571
Publisher / Repository: DOI PREFIX: 10.1029
Journal Name: Water Resources Research
Volume: 55
Issue: 4
ISSN: 0043-1397
Page Range / eLocation ID: p. 2900-2915
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. This paper studies the fundamental problem of learning multi-layer generator models. The multi-layer generator model builds multiple layers of latent variables as a prior model on top of the generator, which benefits learning complex data distribution and hierarchical representations. However, such a prior model usually focuses on modeling inter-layer relations between latent variables by assuming non-informative (conditional) Gaussian distributions, which can be limited in model expressivity. To tackle this issue and learn more expressive prior models, we propose an energy-based model (EBM) on the joint latent space over all layers of latent variables with the multi-layer generator as its backbone. Such joint latent space EBM prior model captures the intra-layer contextual relations at each layer through layer-wise energy terms, and latent variables across different layers are jointly corrected. We develop a joint training scheme via maximum likelihood estimation (MLE), which involves Markov Chain Monte Carlo (MCMC) sampling for both prior and posterior distributions of the latent variables from different layers. To ensure efficient inference and learning, we further propose a variational training scheme where an inference model is used to amortize the costly posterior MCMC sampling. Our experiments demonstrate that the learned model can be expressive in generating high-quality images and capturing hierarchical features for better outlier detection. 
  2. This paper studies the fundamental problem of multi-layer generator models in learning hierarchical representations. The multi-layer generator model that consists of multiple layers of latent variables organized in a top-down architecture tends to learn multiple levels of data abstraction. However, such multi-layer latent variables are typically parameterized to be Gaussian, which can be less informative in capturing complex abstractions, resulting in limited success in hierarchical representation learning. On the other hand, the energy-based (EBM) prior is known to be expressive in capturing the data regularities, but it often lacks the hierarchical structure to capture different levels of hierarchical representations. In this paper, we propose a joint latent space EBM prior model with multi-layer latent variables for effective hierarchical representation learning. We develop a variational joint learning scheme that seamlessly integrates an inference model for efficient inference. Our experiments demonstrate that the proposed joint EBM prior is effective and expressive in capturing hierarchical representations and modeling data distribution. 
  3. We present a novel space‐time Bayesian hierarchical model (BHM) to reconstruct annual Sea Surface Temperature (SST) over a large domain based on SST at limited proxy (i.e., sediment core) locations. The model is tested in the equatorial Pacific. The BHM leverages Principal Component Analysis to identify dominant space‐time modes of contemporary variability of the SST field at the proxy locations and employs these modes in a Gaussian process framework to estimate SSTs across the entire domain. The BHM allows us to model the mean field and covariance, varying in space and time in the process layers of the hierarchy. Using the Markov Chain Monte Carlo (MCMC) method and suitable priors on the model parameters, posterior distributions of the model parameters and, consequently, posterior distributions of the SST fields and the attendant uncertainties are obtained for any desired year. The BHM is calibrated and validated in the contemporary period (1854–2014) and subsequently applied to reconstruct SST fields during the Holocene (0–10 ka). Results are consistent with prior inferences of La Niña‐like conditions during the Holocene. This modeling framework opens exciting prospects for modeling and reconstruction of other fields, such as precipitation, drought indices, and vegetation. 
  4. A constrained stochastic weather generator (CSWG) for producing daily mean air temperature and precipitation based on annual mean air temperature and precipitation from tree-ring records is developed and tested in this paper. The principle for stochastically generating daily mean air temperature assumes that temperatures in any year can be approximated by a sinusoidal wave function plus a perturbation from the baseline. The CSWG for stochastically producing daily precipitation is based on three additional assumptions: (1) In each month, the total precipitation can be estimated from annual precipitation if there exists a relationship between the annual and monthly precipitations. If that relationship exists, then (2) for each month, the number of dry days and the maximum daily precipitation can be estimated from the total precipitation in that month. Finally, (3) in each month, there exists a probability distribution of daily precipitation amount for each wet day. These assumptions allow the development of a weather generator that constrains statistically relevant daily temperature and precipitation predictions based on a specified annual value, and thus this study presents a unique method that can be used to explore historic (e.g., archeological questions) or future (e.g., climate change) daily weather conditions based upon specified annual values. 
  5. Joint modeling of spatially oriented dependent variables is commonplace in the environmental sciences, where scientists seek to estimate the relationships among a set of environmental outcomes accounting for dependence among these outcomes and the spatial dependence for each outcome. Such modeling is now sought for massive data sets with variables measured at a very large number of locations. Bayesian inference, while attractive for accommodating uncertainties through hierarchical structures, can become computationally onerous for modeling massive spatial data sets because of its reliance on iterative estimation algorithms. This article develops a conjugate Bayesian framework for analyzing multivariate spatial data using analytically tractable posterior distributions that obviate iterative algorithms. We discuss differences between modeling the multivariate response itself as a spatial process and that of modeling a latent process in a hierarchical model. We illustrate the computational and inferential benefits of these models using simulation studies and analysis of a vegetation index data set with spatially dependent observations numbering in the millions. 
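The idea behind item 5, a conjugate model whose posterior is available in closed form and so needs no iterative sampler, can be shown with a much simpler stand-in. The sketch below is not that article's multivariate spatial model: it assumes a univariate, non-spatial linear regression with a Gaussian prior on the coefficients and a known noise variance, and the function name `conjugate_posterior` is hypothetical.

```python
import numpy as np

def conjugate_posterior(X, y, prior_mean, prior_cov, noise_var):
    """Closed-form Gaussian posterior for regression coefficients.

    Model (assumed for illustration): y = X @ beta + e, e ~ N(0, noise_var * I),
    with prior beta ~ N(prior_mean, prior_cov) and noise_var treated as known.
    """
    prior_prec = np.linalg.inv(prior_cov)
    # Posterior precision is prior precision plus the data precision.
    post_prec = prior_prec + X.T @ X / noise_var
    post_cov = np.linalg.inv(post_prec)
    # Posterior mean is a precision-weighted blend of prior mean and data.
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

X = np.eye(2)
y = np.array([2.0, 4.0])
post_mean, post_cov = conjugate_posterior(X, y, np.zeros(2), np.eye(2), 1.0)
# With equal prior and data precision, the posterior mean lies halfway
# between the prior mean (0, 0) and the observations (2, 4).
```

Because the posterior is computed with a single linear solve, the cost is fixed by the matrix sizes rather than by the number of sampler iterations, which is the computational benefit the abstract points to.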