Title: A hierarchical analysis of the impact of methodological decisions on statistical downscaling of daily precipitation and air temperatures

Despite the widespread application of statistical downscaling tools, uncertainty remains regarding the role of model formulation in determining model skill for daily maximum and minimum temperature (Tmax and Tmin) and for precipitation occurrence and intensity. Impacts of several key aspects of statistical transfer function form on model skill are evaluated using a framework resistant to model overspecification. We focus on (a) model structure: simple (generalized linear models, GLMs) versus complex (artificial neural networks, ANNs) models; and (b) predictor selection: a fixed number of predictors chosen a priori versus stepwise selection of predictors, and inclusion of grid point values versus predictors derived from application of principal components analysis (PCA) to spatial fields. We also examine the influence of domain size on model performance. For precipitation downscaling, we consider the role of the threshold used to characterize a wet day and apply three approaches (Poisson and Gamma distributions in GLM and ANN) to downscale wet‐day precipitation amounts. No downscaling formulation is optimal for all predictands and at all 10 locations representing diverse U.S. climates, and, because variance inflation is excluded, all of the downscaling formulations fail to reproduce the range of observed variability. Nevertheless, models with larger suites of prospective predictors generally have higher skill. For temperature downscaling, ANNs generally outperform GLMs, with greater improvements for Tmin than Tmax. Use of PCA‐derived predictors does not systematically improve model skill, but does improve skill for temperature extremes. Model skill for precipitation occurrence generally increases as the wet‐day threshold increases, and models using PCA‐derived predictors tend to outperform those based on grid cell predictors. Each model for wet‐day precipitation intensity overestimates annual total precipitation and underestimates the proportion derived from extreme precipitation events, but ANN‐based models and those with larger predictor suites tend to have the smallest bias.
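
Two of the preprocessing choices named in the abstract, the wet‐day threshold and PCA‐derived predictors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_wet_days` and `pca_predictors`, the default 1.0 mm threshold, the "greater than or equal to" wet‐day rule, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def split_wet_days(precip_mm, wet_threshold_mm=1.0):
    """Split a daily precipitation series into occurrence and wet-day amounts.

    Assumption (not from the paper): a day is wet when
    precipitation >= wet_threshold_mm.
    """
    precip = np.asarray(precip_mm, dtype=float)
    is_wet = precip >= wet_threshold_mm
    occurrence = is_wet.astype(int)  # 0/1 predictand for an occurrence model
    amounts = precip[is_wet]         # predictand for an intensity model
    return occurrence, amounts


def pca_predictors(field, n_components):
    """Reduce a (n_days, n_gridpoints) field to leading PC scores.

    Returns the scores and the fraction of variance each retained
    component explains.
    """
    field = np.asarray(field, dtype=float)
    anomalies = field - field.mean(axis=0)  # remove each grid point's mean
    u, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    variance_fraction = s**2 / np.sum(s**2)
    return scores, variance_fraction[:n_components]


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Hypothetical daily precipitation totals in mm.
    daily_precip = [0.0, 0.2, 1.0, 5.4, 0.05]
    for threshold in (0.1, 1.0):
        occurrence, amounts = split_wet_days(daily_precip, threshold)
        print(threshold, occurrence.tolist(), amounts.tolist())

    # Hypothetical predictor field: 50 days by 8 grid points.
    field = rng.normal(size=(50, 8))
    scores, variance_fraction = pca_predictors(field, n_components=3)
    print(scores.shape, variance_fraction.round(3))
```

Raising the threshold moves light‐precipitation days from the wet class to the dry class, which changes both the occurrence predictand and the sample available to the intensity model. The PC scores could then replace raw grid point values as inputs to a GLM or ANN.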

 
NSF-PAR ID: 10461245
Author(s) / Creator(s):
Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name: International Journal of Climatology
Volume: 39
Issue: 6
ISSN: 0899-8418
Page Range / eLocation ID: p. 2880-2900
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Short‐term forecasting of wind gusts, particularly those of higher intensity, is of great societal importance but is challenging due to the presence of multiple gust generation mechanisms. Wind gust observations from eight high‐passenger‐volume airports across the continental United States (CONUS) are summarized and used to develop predictive models of wind gust occurrence and magnitude. These short‐term (same hour) forecast models are built using multiple logistic and linear regression, as well as artificial neural networks (ANNs) of varying complexity. A suite of 19 upper‐air predictors drawn from the ERA5 reanalysis and an autoregressive (AR) term are used. Stepwise procedures instruct predictor selection, and resampling is used to quantify model stability. All models are developed separately for the warm (April–September) and cold (October–March) seasons. Results show that ANNs of 3–5 hidden layers (HLs) generally exhibit higher hit rates than logistic regression models and also improve skill with respect to wind gust magnitudes. However, deeper networks with more HLs increase false alarm rates in occurrence models and mean absolute error in magnitude models due to model overfitting. For model skill, inclusion of the AR term is critical while the majority of the remaining skill derives from wind speeds and lapse rates. A predictive ceiling is also clearly demonstrated, particularly for the strong and damaging gust magnitudes, which appears to be partially due to ERA5 predictor characteristics and the presence of mixed wind climates.

     
  2. Abstract

    Few studies have utilized machine learning techniques to predict or understand the Madden‐Julian oscillation (MJO), a key source of subseasonal variability and predictability. Here, we present a simple framework for real‐time MJO prediction using shallow artificial neural networks (ANNs). We construct two ANN architectures, one deterministic and one probabilistic, that predict a real‐time MJO index using maps of tropical variables. These ANNs make skillful MJO predictions out to ∼18 days in October‐March and ∼11 days in April‐September, outperforming conventional linear models and efficiently capturing aspects of MJO predictability found in more complex, dynamical models. The flexibility and explainability of simple ANN frameworks are highlighted through varying model input and applying ANN explainability techniques that reveal sources and regions important for ANN prediction skill. The accessibility, performance, and efficiency of this simple machine learning framework make it more broadly applicable to predicting and understanding other Earth system phenomena.

     
  3. Abstract

    Previous studies found conflicting results on the importance of temperature and precipitation versus geochemical variables for predicting soil organic carbon (SOC) concentrations and trends with depth, and most utilized linear statistical models. To reconcile the controversy, we used data from 2574 mineral horizons from 675 pits from National Ecological Observatory Network sites across North America, typically collected to 1 m depth. Climate was a fundamental predictor of SOC and played similarly important roles as some geochemical predictors. Yet, this only emerged in the generalized additive mixed model and random forest model and was obscured in the linear mixed model. Relationships between water availability and SOC were strongest in very dry ecosystems and SOC increased most strongly at mean annual temperature < 0°C. In all models, depth, oxalate‐extractable Al (Alox), pH, and exchangeable calcium plus exchangeable magnesium were important while silt + clay, oxalate‐extractable Fe (Feox), and vegetation type were weaker predictors. Climate and pH were independently related to SOC and also interacted with geochemical composition: Feox and Alox related more strongly to SOC in wet or cold climates. Most predictors had nonlinear threshold relationships with SOC, and a saturating response to increasing reactive metals indicates soils where SOC might be limited by C inputs. We observed a mostly constant relative importance of geochemical and climate predictors of SOC with increasing depth, challenging previous statements. Overall, our findings challenge the notion that climate is redundant after accounting for geochemistry and demonstrate that considering their nonlinearities and interactions improves spatial predictions of SOC.

     
  4. Abstract

    Gridded air temperature data are required in various fields such as ecological modeling, weather forecasting, and surface energy balance assessment. In this work, a piecewise multiple linear regression model is used to produce high‐resolution (250 m) daily maximum (Tmax), minimum (Tmin), and mean (Tmean) near‐surface air temperature maps for the State of Hawaiʻi for a 32‐year period (1990–2021). Multiple meteorological and geographical variables such as the elevation, daily rainfall, coastal distance index, leaf area index, albedo, topographic position index, and wind speed are independently tested to determine the most well‐suited predictor variables for optimal model performance. During the mapping process, input data scarcity is addressed first by gap‐filling critical stations at high elevation using a predetermined linear relationship with other strongly‐correlated stations, and second, by supplementing the training dataset with station data from neighboring islands. Despite the numerous covariates physically linked to temperature, the most parsimonious model selection uses elevation as its sole predictor, and the inclusion of the additional variables results in increased cross‐validation errors. The mean absolute error of resultant estimated Tmax and Tmin maps over the Hawaiian Islands from 1990 to 2021 is 1.7°C and 1.3°C, respectively. Corresponding bias values are 0.01°C and −0.13°C, respectively, for the same variables. Overall, the results show the proposed methodology can robustly generate daily air temperature maps from point‐scale measurements over complex topography.

     
  5. Artificial Neural Networks (ANNs) are currently being used as function approximators in many state-of-the-art Reinforcement Learning (RL) algorithms. Spiking Neural Networks (SNNs) have been shown to drastically reduce the energy consumption of ANNs by encoding information in sparse temporal binary spike streams, hence emulating the communication mechanism of biological neurons. Due to their low energy consumption, SNNs are considered to be important candidates as co-processors to be implemented in mobile devices. In this work, the use of SNNs as stochastic policies is explored under an energy-efficient first-to-spike action rule, whereby the action taken by the RL agent is determined by the occurrence of the first spike among the output neurons. A policy gradient-based algorithm is derived considering a Generalized Linear Model (GLM) for spiking neurons. Experimental results demonstrate the capability of online trained SNNs as stochastic policies to gracefully trade energy consumption, as measured by the number of spikes, and control performance. Significant gains are shown as compared to the standard approach of converting an offline trained ANN into an SNN. 