

Title: Scikit-downscale: an open source Python package for scalable climate downscaling
Climate data from Earth System Models are increasingly being used to study the impacts of climate change on a broad range of biogeophysical (forest fires, fisheries, etc.) and human systems (reservoir operations, urban heat waves, etc.). Before these data can be used to study many of these systems, post-processing steps commonly referred to as bias correction and statistical downscaling must be performed. “Bias correction” is used to correct persistent biases in climate model output and “statistical downscaling” is used to increase the spatiotemporal resolution of the model output (e.g. from 1 deg to 1/16th deg grid boxes). For our purposes, we’ll refer to both parts as “downscaling”. In the past few decades, the applications community has developed a plethora of downscaling methods. Many of these methods are ad hoc collections of post-processing routines while others target very specific applications. The proliferation of downscaling methods has left the climate applications community with an overwhelming body of research to sort through without much in the form of synthesis guiding method selection or applicability. Motivated by the pressing socio-environmental challenges of climate change, and with the learnings from previous downscaling efforts in mind, we have begun working on a community-centered open framework for climate downscaling: scikit-downscale. We believe that the community will benefit from the presence of a well-designed open source downscaling toolbox with standard interfaces alongside a repository of benchmark data to test and evaluate new and existing downscaling methods. In this notebook, we provide an overview of the scikit-downscale project, detailing how it can be used to downscale a range of surface climate variables such as air temperature and precipitation. We also highlight how the scikit-downscale framework is being used to compare existing methods and how it can be extended to support the development of new downscaling methods.
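The abstract describes a toolbox of bias-correction methods behind standard interfaces. As an illustration of the simplest family of such methods, here is a minimal empirical quantile-mapping sketch in plain NumPy. The function name and synthetic data are ours for illustration only, not part of the scikit-downscale API:

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_fut):
    """Empirical quantile-mapping bias correction.

    Each future model value is mapped to the observed value that sits
    at the same empirical quantile in the historical model distribution.
    """
    # Quantile of each future value within the historical model CDF.
    quantiles = np.searchsorted(np.sort(model_hist), model_fut) / len(model_hist)
    quantiles = np.clip(quantiles, 0.0, 1.0)
    # Look up the observed values at those quantiles.
    return np.quantile(obs_hist, quantiles)

rng = np.random.default_rng(0)
obs = rng.normal(15.0, 3.0, 1000)   # "observed" temperatures (deg C)
model = obs + 2.0                   # model output with a +2 degree bias
corrected = quantile_map(model, obs, model)
print(f"bias before: {model.mean() - obs.mean():.2f}, "
      f"after: {corrected.mean() - obs.mean():.2f}")
```

After mapping, the persistent +2 degree bias is essentially removed; real implementations add refinements (seasonal windows, extrapolation rules for out-of-range values) omitted here.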
Award ID(s):
1937136
NSF-PAR ID:
10284442
Journal Name:
2020 EarthCube Annual Meeting
Sponsoring Org:
National Science Foundation
More Like this
  1. Earth system models (ESMs) are the primary tool used to understand and project changes to the climate system. ESM projections underpin analyses of human dimensions of the climate issue, yet little is known about how ESMs are used in human dimensions research. Such foundational information is necessary for future critical assessments of ESMs. We review applications of a leading ESM, the National Center for Atmospheric Research (NCAR) Community Earth System Model (CESM), to human dimensions topics since 2004. We find that this research has grown substantially over this period, twice as fast as CESM research overall. Although many studies have primarily addressed long‐term impacts on physical systems with societal relevance, applications to managed, societal, and ecological systems have grown quickly and now make up more than half of CESM human dimensions work. CESM applications focused nearly equally on global and regional analyses, most often using multimodel ensembles, although the use of single simulations remains prevalent. Downscaling and bias correction of output was infrequent and most common for regional studies. U.S.‐based, university‐affiliated authors primarily drove human dimensions work using CESM, with only 12% of authors based at NCAR. Our findings identify important questions that warrant further investigation, such as reasons for the infrequent use of downscaling and bias correction techniques; motivations to continue to use older model versions after newer model versions have been released; and model development needs for improved human dimensions applications. Additionally, our synthesis provides a baseline and framework that enables continued tracking of CESM and other ESMs.

    This article is categorized under:

    Assessing Impacts of Climate Change > Evaluating Future Impacts of Climate Change

  2. Abstract—Numerical simulation of weather is resolution-constrained due to the high computational cost of integrating the coupled PDEs that govern atmospheric motion. For example, the most highly-resolved numerical weather prediction models are limited to approximately 3 km. However, many weather and climate impacts occur over much finer scales, especially in urban areas and regions with high topographic complexity like mountains or coastal regions. Thus, several statistical methods have been developed in the climate community to downscale numerical model output to finer resolutions. This is conceptually similar to image super-resolution (SR) [1], and in this work we report the results of applying SR methods to the downscaling problem. In particular we test the extent to which an SR method based on a Generative Adversarial Network (GAN) can recover a grid of wind speed from an artificially downsampled version, compared against a standard bicubic upsampling approach and another machine-learning-based approach, SR-CNN [1]. We use ESRGAN [2] to learn to downscale wind speeds by a factor of 4 from a coarse grid. We find that we can recover spatial details with higher fidelity than bicubic upsampling or SR-CNN. The bicubic and SR-CNN methods perform better than ESRGAN on coarse metrics such as MSE. However, the high frequency power spectrum is captured remarkably well by the ESRGAN, virtually identical to the real data, while bicubic and SR-CNN fidelity drops significantly at high frequency. This indicates that SR is considerably better at matching the higher-order statistics of the dataset, consistent with the observation that the generated images are of superior visual quality compared with SR-CNN.
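The evaluation protocol in this abstract (coarsen a field artificially, reconstruct it, score against the original) can be sketched in a few lines. The stand-in below uses block-mean coarsening and bilinear rather than bicubic upsampling for brevity, with a synthetic field in place of real wind speeds; all names and data are illustrative:

```python
import numpy as np

def coarsen(field, factor):
    """Block-average a 2D field by an integer factor (simulated coarse grid)."""
    h, w = field.shape
    return field.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample_bilinear(field, factor):
    """Bilinear upsampling back to the fine grid (stand-in for the bicubic baseline)."""
    h, w = field.shape
    rows = np.linspace(0, h - 1, h * factor)
    cols = np.linspace(0, w - 1, w * factor)
    # Interpolate along columns first, then along rows.
    tmp = np.array([np.interp(cols, np.arange(w), row) for row in field])
    return np.array([np.interp(rows, np.arange(h), col) for col in tmp.T]).T

rng = np.random.default_rng(42)
x = np.linspace(0, 4 * np.pi, 64)
fine = np.sin(x)[:, None] * np.cos(x)[None, :] + 0.1 * rng.standard_normal((64, 64))
coarse = coarsen(fine, 4)              # 64x64 -> 16x16
recon = upsample_bilinear(coarse, 4)   # back to 64x64
mse = np.mean((recon - fine) ** 2)
print(f"reconstruction MSE: {mse:.4f}")
```

A learned SR model would be trained to beat this baseline, and — as the abstract notes — scored not only on MSE but also on how well it recovers the high-frequency power spectrum.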
  3. Abstract

    Statistical processing of numerical model output has been a part of both weather forecasting and climate applications for decades. Statistical techniques are used to correct systematic biases in atmospheric model outputs and to represent local effects that are unresolved by the model, referred to as downscaling. Many downscaling techniques have been developed, and it has been difficult to systematically explore the implications of the individual decisions made in the development of downscaling methods. Here we describe a unified framework that enables the user to evaluate multiple decisions made in the methods used to statistically postprocess output from weather and climate models. The Ensemble Generalized Analog Regression Downscaling (En-GARD) method enables the user to select any number of input variables, predictors, mathematical transformations, and combinations for use in parametric or nonparametric downscaling approaches. En-GARD enables explicitly predicting both the probability of event occurrence and the event magnitude. Outputs from En-GARD include errors in model fit, enabling the production of an ensemble of projections through sampling of the probability distributions of each climate variable. We apply En-GARD to regional climate model simulations to evaluate the relative importance of different downscaling method choices on simulations of the current and future climate. We show that choice of predictor variables is the most important decision affecting downscaled future climate outputs, while having little impact on the fidelity of downscaled outcomes for current climate. We also show that weak statistical relationships prevent such approaches from predicting large changes in extreme events on a daily time scale.
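A k-nearest-analog scheme in the spirit of (though far simpler than) the analog regression described above might look like the following sketch, which separately predicts occurrence probability and wet-day amount as the abstract describes. The function, predictors, and data are hypothetical:

```python
import numpy as np

def analog_downscale(train_X, train_y, target_X, k=30, wet_threshold=0.1):
    """k-nearest-analog downscaling: for each target day, find the k most
    similar historical days in predictor space, then derive (a) probability
    of precipitation occurrence and (b) mean wet-day amount from their obs."""
    probs, amounts = [], []
    for x in target_X:
        d = np.linalg.norm(train_X - x, axis=1)  # distance in predictor space
        idx = np.argsort(d)[:k]                  # k closest analog days
        analog_y = train_y[idx]
        wet = analog_y > wet_threshold
        probs.append(wet.mean())                 # fraction of wet analogs
        amounts.append(analog_y[wet].mean() if wet.any() else 0.0)
    return np.array(probs), np.array(amounts)

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))  # e.g. humidity, wind, pressure anomalies
# Synthetic truth: wet days occur when the first predictor is high.
y = np.where(X[:, 0] > 0.5, np.exp(rng.normal(1.0, 0.5, 500)), 0.0)
p, amt = analog_downscale(X, y, X[:25])
print(p.shape, amt.shape)
```

En-GARD layers regression, transformations, and error sampling on top of this basic analog idea to produce calibrated ensembles.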

  4. Abstract. Systematic biases and coarse resolutions are major limitations of current precipitation datasets. Many deep learning (DL)-based studies have been conducted for precipitation bias correction and downscaling. However, it is still challenging for the current approaches to handle complex features of hourly precipitation, resulting in the incapability of reproducing small-scale features, such as extreme events. This study developed a customized DL model by incorporating customized loss functions, multitask learning and physically relevant covariates to bias correct and downscale hourly precipitation data. We designed six scenarios to systematically evaluate the added values of weighted loss functions, multitask learning, and atmospheric covariates compared to the regular DL and statistical approaches. The models were trained and tested using the Modern-Era Retrospective Analysis for Research and Applications version 2 (MERRA2) reanalysis and the Stage IV radar observations over the northern coastal region of the Gulf of Mexico on an hourly time scale. We found that all the scenarios with weighted loss functions performed notably better than the other scenarios with conventional loss functions and a quantile-mapping-based approach at hourly, daily, and monthly time scales as well as extremes. Multitask learning showed improved performance on capturing fine features of extreme events, and accounting for atmospheric covariates highly improved model performance at hourly and aggregated time scales, while the improvement is not as large as from weighted loss functions. We show that the customized DL model can better downscale and bias correct hourly precipitation datasets and provide improved precipitation estimates at fine spatial and temporal resolutions where regular DL and statistical methods experience challenges.
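The weighted-loss idea can be illustrated with a toy intensity-weighted MSE: errors on heavy-precipitation hours are penalized more than errors on dry or drizzle hours. The linear weighting scheme below is an assumption for illustration, not the paper's actual loss function:

```python
import numpy as np

def weighted_mse(y_true, y_pred, gamma=1.0):
    """MSE with per-sample weights that grow with observed intensity,
    so errors on heavy-precipitation hours are penalized more (gamma=0
    recovers the ordinary, unweighted MSE)."""
    w = 1.0 + gamma * y_true          # heavier rain -> larger weight (assumed form)
    return np.mean(w * (y_true - y_pred) ** 2)

y_true = np.array([0.0, 0.2, 1.0, 10.0])    # mm/h, one extreme hour
y_flat = np.full_like(y_true, y_true.mean())  # a forecast that smooths extremes

m0 = weighted_mse(y_true, y_flat, gamma=0.0)  # ordinary MSE
m1 = weighted_mse(y_true, y_flat, gamma=1.0)  # extreme-weighted MSE
print(f"plain MSE: {m0:.2f}, weighted MSE: {m1:.2f}")
```

Because the smoothed forecast misses the 10 mm/h hour, the weighted loss penalizes it far more heavily, which is the training signal that pushes a DL model toward reproducing extremes.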
  5. Despite the widespread application of statistical downscaling tools, uncertainty remains regarding the role of model formulation in determining model skill for daily maximum and minimum temperature (Tmax and Tmin), and precipitation occurrence and intensity. Impacts of several key aspects of statistical transfer function form on model skill are evaluated using a framework resistant to model overspecification. We focus on: (a) model structure: simple (generalized linear models, GLMs) versus complex (artificial neural networks, ANNs) models. (b) Predictor selection: fixed number of predictors chosen a priori versus stepwise selection of predictors, and inclusion of grid point values versus predictors derived from application of principal components analysis (PCA) to spatial fields. We also examine the influence of domain size on model performance. For precipitation downscaling, we consider the role of the threshold used to characterize a wet day and apply three approaches (Poisson and Gamma distributions in GLM and ANN) to downscale wet-day precipitation amounts. While no downscaling formulation is optimal for all predictands at 10 locations representing diverse U.S. climates, and, due to the exclusion of variance inflation, all of the downscaling formulations fail to reproduce the range of observed variability, models with larger suites of prospective predictors generally have higher skill. For temperature downscaling, ANNs generally outperform GLMs, with greater improvements for Tmin than Tmax. Use of PCA-derived predictors does not systematically improve model skill, but does improve skill for temperature extremes. Model skill for precipitation occurrence generally increases as the wet-day threshold increases, and models using PCA-derived predictors tend to outperform those based on grid cell predictors. Each model for wet-day precipitation intensity overestimates annual total precipitation and underestimates the proportion derived from extreme precipitation events, but ANN-based models and those with larger predictor suites tend to have the smallest bias.
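The PCA-derived-predictor approach evaluated above can be sketched as follows: compute the leading principal components of a predictor field via SVD, then fit an ordinary-least-squares transfer function (the simplest GLM) for a local temperature. All data and coefficients below are synthetic, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic predictor field: 1000 days x 100 grid points (e.g. geopotential height),
# with one shared large-scale mode of variability added in.
field = rng.standard_normal((1000, 100))
field += np.outer(rng.standard_normal(1000), np.linspace(-1, 1, 100))

# PCA via SVD on anomalies: keep the leading modes as downscaling predictors.
anom = field - field.mean(axis=0)
U, s, Vt = np.linalg.svd(anom, full_matrices=False)
n_modes = 5
pcs = U[:, :n_modes] * s[:n_modes]   # principal-component time series

# Synthetic local Tmax driven by the first mode plus local noise.
tmax = 25.0 + 3.0 * pcs[:, 0] / pcs[:, 0].std() + rng.normal(0, 1, 1000)

# Ordinary-least-squares transfer function on the PCA predictors.
A = np.column_stack([np.ones(len(pcs)), pcs])
coef, *_ = np.linalg.lstsq(A, tmax, rcond=None)
pred = A @ coef
r2 = 1 - np.sum((tmax - pred) ** 2) / np.sum((tmax - tmax.mean()) ** 2)
print(f"R^2 of PCA-based transfer function: {r2:.2f}")
```

Using a handful of PCs instead of every grid point reduces collinearity among predictors and guards against the overspecification the abstract's framework is designed to resist.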
