

Title: Understanding Predictability of Daily Southeast U.S. Precipitation Using Explainable Machine Learning
Abstract

We investigate the predictability of the sign of daily southeastern U.S. (SEUS) precipitation anomalies associated with simultaneous predictors of large-scale climate variability using machine learning models. Models using index-based climate predictors and gridded fields of large-scale circulation as predictors are utilized. Logistic regression (LR) and fully connected neural networks using indices of climate phenomena as predictors produce neither accurate nor reliable predictions, indicating that the indices themselves are not good predictors. Using gridded fields as predictors, an LR and convolutional neural network (CNN) are more accurate than the index-based models. However, only the CNN can produce reliable predictions that can be used to identify forecasts of opportunity. Using explainable machine learning we identify which variables and grid points of the input fields are most relevant for confident and correct predictions in the CNN. Our results show that the local circulation is most important as represented by maximum relevance of 850-hPa geopotential heights and zonal winds to making skillful, high-probability predictions. Corresponding composite anomalies identify connections with El Niño–Southern Oscillation during winter and the Atlantic multidecadal oscillation and North Atlantic subtropical high during summer.
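The idea of a "forecast of opportunity" can be illustrated with a minimal sketch. This is not the authors' code: the function name `accuracy_by_confidence`, the synthetic data, and the 10% cutoff are assumptions made only for illustration. If a model's predicted probabilities are reliable, accuracy should rise when only the most confident predictions are kept.

```python
import math
import random


def accuracy_by_confidence(probs, labels, keep_fraction):
    """Accuracy over the most confident fraction of predictions.

    probs: predicted probability of a positive precipitation anomaly (0..1)
    labels: observed sign, 1 for positive and 0 for negative
    keep_fraction: share of predictions to keep, ranked by confidence
    """
    # Confidence is measured as the distance of the probability from 0.5.
    ranked = sorted(zip(probs, labels),
                    key=lambda pl: abs(pl[0] - 0.5),
                    reverse=True)
    n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
    kept = ranked[:n_keep]
    # A prediction is correct when the predicted sign matches the observed sign.
    correct = sum((p >= 0.5) == bool(y) for p, y in kept)
    return correct / n_keep


# Synthetic, perfectly reliable "model": each label is drawn with the
# stated probability. These are made-up numbers, not data from the study.
random.seed(0)
probs = [random.random() for _ in range(5000)]
labels = [1 if random.random() < p else 0 for p in probs]

acc_all = accuracy_by_confidence(probs, labels, 1.0)
acc_top = accuracy_by_confidence(probs, labels, 0.1)
print(f"all predictions: {acc_all:.3f}, most confident 10%: {acc_top:.3f}")
```

For an unreliable model, the accuracy of the most confident subset would not be systematically higher, which is why reliability matters for identifying such forecasts.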

 
Award ID(s):
2029260
NSF-PAR ID:
10382614
Publisher / Repository:
American Meteorological Society
Journal Name:
Artificial Intelligence for the Earth Systems
Volume:
1
Issue:
4
ISSN:
2769-7525
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    In recent years, harmful algal blooms (HABs) have increased in their severity and extent in many parts of the world and pose serious threats to local aquaculture, fisheries, and public health. In many cases, the mechanisms triggering and regulating HAB events remain poorly understood. Using underwater microscopy and a Residual Neural Network (ResNet‐18) to taxonomically classify imaged organisms, we developed a daily abundance record of four potentially harmful algae (Akashiwo sanguinea, Chattonella spp., Dinophysis spp., and Lingulodinium polyedra) and major grazer groups (ciliates, copepod nauplii, and copepods) from August 2017 to November 2020 at Scripps Institution of Oceanography pier, a coastal location in the Southern California Bight. Random Forest algorithms were used to identify the optimal combination of environmental and ecological variables that produced the most accurate abundance predictions for each taxon. We developed models with high prediction accuracy for A. sanguinea, Chattonella spp., and L. polyedra, whereas models for Dinophysis spp. showed lower prediction accuracy. Offshore nutricline depth and indices describing climate variability, including El Niño Southern Oscillation, Pacific Decadal Oscillation, and North Pacific Gyre Oscillation, that influence regional‐scale ocean circulation patterns and environmental conditions, were key predictor variables for these HAB taxa. These metrics of regional‐scale processes were generally better predictors of HAB taxa abundances at this coastal location than the in situ environmental measurements. Ciliate abundance was an important predictor of Chattonella and Dinophysis spp., but not of A. sanguinea and L. polyedra. Our findings indicate that combining regional and local environmental factors with microzooplankton population dynamics can improve real‐time HAB abundance forecasts.

     
  2. Abstract

    Heatwaves are extreme near-surface temperature events that can have substantial impacts on ecosystems and society. Early warning systems help to reduce these impacts by helping communities prepare for hazardous climate-related events. However, state-of-the-art prediction systems often cannot make accurate forecasts of heatwaves more than two weeks in advance, which are required for advance warnings. We therefore investigate the potential of statistical and machine learning methods to understand and predict central European summer heatwaves on time scales of several weeks. As a first step, we identify the most important regional atmospheric and surface predictors based on previous studies and supported by a correlation analysis: 2-m air temperature, 500-hPa geopotential, precipitation, and soil moisture in central Europe, as well as Mediterranean and North Atlantic sea surface temperatures, and the North Atlantic jet stream. Based on these predictors, we apply machine learning methods to forecast two targets: summer temperature anomalies and the probability of heatwaves for 1–6 weeks lead time at weekly resolution. For each of these two target variables, we use both a linear and a random forest model. The performance of these statistical models decays with lead time, as expected, but outperforms persistence and climatology at all lead times. For lead times longer than two weeks, our machine learning models compete with the ensemble mean of the European Centre for Medium-Range Weather Forecasts’ hindcast system. We thus show that machine learning can help improve subseasonal forecasts of summer temperature anomalies and heatwaves.

    Significance Statement

    Heatwaves (prolonged extremely warm temperatures) cause thousands of fatalities worldwide each year. These damaging events are becoming even more severe with climate change. This study aims to improve advance predictions of summer heatwaves in central Europe by using statistical and machine learning methods. Machine learning models are shown to compete with conventional physics-based models for forecasting heatwaves more than two weeks in advance. These early warnings can be used to activate effective and timely response plans targeting vulnerable communities and regions, thereby reducing the damage caused by heatwaves.

     
  3. Abstract

    Radiative transfer (RT) is a crucial but computationally expensive process in numerical weather/climate prediction. We develop neural networks (NN) to emulate a common RT parameterization called the Rapid Radiative Transfer Model (RRTM), with the goal of creating a faster parameterization for the Global Forecast System (GFS) v16. In previous work we emulated a highly simplified version of the shortwave RRTM only, excluding many predictor variables, driven by Rapid Refresh forecasts interpolated to a consistent height grid, and using only 30 sites in the Northern Hemisphere. In this work we emulate the full shortwave and longwave RRTM, with all predictor variables, driven by GFSv16 forecasts on the native pressure–sigma grid, and using data from around the globe. We experiment with NNs of widely varying complexity, including the U-net++ and U-net3+ architectures and deeply supervised training, designed to ensure realistic and accurate structure in gridded predictions. We evaluate the optimal shortwave NN and optimal longwave NN in great detail, as a function of geographic location, cloud regime, and other weather types. Both NNs produce extremely reliable heating rates and fluxes. The shortwave NN has an overall RMSE/MAE/bias of 0.14/0.08/−0.002 K day⁻¹ for heating rate and 6.3/4.3/−0.1 W m⁻² for net flux. Analogous numbers for the longwave NN are 0.22/0.12/−0.0006 K day⁻¹ and 1.07/0.76/+0.01 W m⁻². Both NNs perform well in nearly all situations, and the shortwave (longwave) NN is 7510 (90) times faster than the RRTM. Both will soon be tested online in the GFSv16.

    Significance Statement

    Radiative transfer is an important process for weather and climate. Accurate radiative transfer models exist, such as the RRTM, but these models are computationally slow. We develop neural networks (NNs), a type of machine learning model that is often computationally fast after training, to mimic the RRTM. We wish to accelerate the RRTM by orders of magnitude without sacrificing much accuracy. We drive both the NNs and RRTM with data from the GFSv16, an operational weather model, using locations around the globe during all seasons. We show that the NNs are highly accurate and much faster than the RRTM, which suggests that the NNs could be used to solve radiative transfer inside the GFSv16.

     
  4. Abstract

    Precipitation is one of the most difficult variables to estimate using large-scale predictors. Over South America (SA), this task is even more challenging, given the complex topography of the Andes. Empirical–statistical downscaling (ESD) models can be used for this purpose, but such models, applicable for all of SA, have not yet been developed. To address this issue, we construct an ESD model using multiple-linear-regression techniques for the period 1982–2016 that is based on large-scale circulation indices representing tropical Pacific Ocean, Atlantic Ocean, and South American climate variability, to estimate austral summer [December–February (DJF)] precipitation over SA. Statistical analyses show that the ESD model can reproduce observed precipitation anomalies over the tropical Andes (Ecuador, Colombia, Peru, and Bolivia), the eastern equatorial Amazon basin, and the central part of the western Argentinian Andes. On a smaller scale, the ESD model also shows good results over the Western Cordillera of the Peruvian Andes. The ESD model reproduces anomalously dry conditions over the eastern equatorial Amazon and the wet conditions over southeastern South America (SESA) during the three extreme El Niños: 1982/83, 1997/98, and 2015/16. However, it overestimates the observed intensities over SESA. For the central Peruvian Andes as a case study, results further show that the ESD model can correctly reproduce DJF precipitation anomalies over the entire Mantaro basin during the three extreme El Niño episodes. Moreover, multiple experiments with varying predictor combinations of the ESD model corroborate the hypothesis that the interaction between the South Atlantic convergence zone and the equatorial Atlantic Ocean provoked the Amazon drought in 2015/16.
  5. Abstract

    Substantial marine, terrestrial, and atmospheric changes have occurred over the Greenland region during the last century. Several studies have documented record levels of Greenland Ice Sheet (GrIS) summer melt extent during the 2000s and 2010s, but relatively little work has been carried out to assess regional climatic changes in other seasons. Here, we focus on the less studied cold‐season (i.e., autumn and winter) climate, tracing the long‐term (1873–2013) variability of Greenland's air temperatures through analyses of coastal observations and model‐derived outlet glacier series and their linkages with North Atlantic sea ice, sea surface temperature (SST), and atmospheric circulation indices. Through a statistical framework, large amounts of west and south Greenland temperature variance (up to r² ~ 50%) can be explained by the seasonally‐contemporaneous combination of the Greenland Blocking Index (GBI) and the North Atlantic Oscillation (NAO; hereafter the combination of GBI and NAO is termed GBI). Lagged and concomitant regional sea‐ice concentration (SIC) and the Atlantic Multidecadal Oscillation (AMO) seasonal indices account for small amounts of residual air temperature variance (r² < ~10%) relative to the GBI. The correlations between GBI and cold‐season temperatures are predominantly positive and statistically significant through time, while regional SIC conditions emerge as a significant covariate from the mid‐20th century through the conclusion of the study period. The inclusion of the cold‐season Pacific Decadal Oscillation (PDO) in multivariate analyses bolsters the air temperature variance explained by the North Atlantic regional predictors, suggesting the remote, background climate state is important to long‐term Greenland temperature variability. These findings imply that large‐scale tropospheric circulation has a strong control on surface temperature over Greenland through dynamic and thermodynamic impacts and stress the importance of understanding the evolving two‐way linkages between the North Atlantic marine and atmospheric environment in order to more accurately predict Greenland seasonal climate variability and change through the 21st century.

     