Title: Generative Ensemble Deep Learning Severe Weather Prediction from a Deterministic Convection-Allowing Model
Abstract An ensemble postprocessing method is developed for the probabilistic prediction of severe weather (tornadoes, hail, and wind gusts) over the conterminous United States (CONUS). The method combines conditional generative adversarial networks (CGANs), a type of deep generative model, with a convolutional neural network (CNN) to postprocess convection-allowing model (CAM) forecasts. The CGANs are designed to create synthetic ensemble members from deterministic CAM forecasts, and their outputs are processed by the CNN to estimate the probability of severe weather. The method is tested using High-Resolution Rapid Refresh (HRRR) 1–24-h forecasts as inputs and Storm Prediction Center (SPC) severe weather reports as targets. The method produced skillful predictions with up to 20% Brier skill score (BSS) increases compared to other neural-network-based reference methods using a testing dataset of HRRR forecasts in 2021. For the evaluation of uncertainty quantification, the method is overconfident but produces meaningful ensemble spreads that can distinguish good and bad forecasts. The quality of CGAN outputs is also evaluated. Results show that the CGAN outputs behave similarly to a numerical ensemble; they preserved the intervariable correlations and the contribution of influential predictors as in the original HRRR forecasts. This work provides a novel approach to postprocess CAM output using neural networks that can be applied to severe weather prediction.
Significance Statement We use a new machine learning (ML) technique to generate probabilistic forecasts of convective weather hazards, such as tornadoes and hailstorms, with the output from high-resolution numerical weather model forecasts. The new ML system generates an ensemble of synthetic forecast fields from a single forecast, which are then used to train ML models for convective hazard prediction. Using this ML-generated ensemble for training leads to improvements of 10%–20% in severe weather forecast skills compared to using other ML algorithms that use only output from the single forecast. This work is unique in that it explores the use of ML methods for producing synthetic forecasts of convective storm events and using these to train ML systems for high-impact convective weather prediction.
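The abstract reports skill as a Brier skill score (BSS) without defining it. As a minimal sketch, assuming the standard definition BSS = 1 - BS / BS_ref, where BS is the mean squared difference between forecast probabilities and binary outcomes, the score could be computed as follows. The function names, the toy arrays, and the constant climatological reference are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def brier_score(prob, obs):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))


def brier_skill_score(prob, obs, ref_prob):
    """BSS = 1 - BS / BS_ref; positive values mean the forecast beats the reference."""
    bs_ref = brier_score(ref_prob, obs)
    if bs_ref == 0.0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier_score(prob, obs) / bs_ref


# Hypothetical toy data: four grid cells, one of which had a severe weather report.
obs = [0, 0, 1, 0]
forecast = [0.1, 0.1, 0.8, 0.2]       # assumed postprocessed probabilities
climatology = [0.25, 0.25, 0.25, 0.25]  # assumed constant reference forecast

print("BS :", brier_score(forecast, obs))
print("BSS:", brier_skill_score(forecast, obs, climatology))
```

A BSS of 1 corresponds to a perfect forecast and 0 to no improvement over the reference, so the score depends on which reference forecast is chosen.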
Award ID(s):
2019758
PAR ID:
10508581
Publisher / Repository:
American Meteorological Society
Journal Name:
Artificial Intelligence for the Earth Systems
Volume:
3
Issue:
2
ISSN:
2769-7525
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract A primary goal of the National Oceanic and Atmospheric Administration Warn-on-Forecast (WoF) project is to provide rapidly updating probabilistic guidance to human forecasters for short-term (e.g., 0–3 h) severe weather forecasts. Postprocessing is required to maximize the usefulness of probabilistic guidance from an ensemble of convection-allowing model forecasts. Machine learning (ML) models have become popular methods for postprocessing severe weather guidance since they can leverage numerous variables to discover useful patterns in complex datasets. In this study, we develop and evaluate a series of ML models to produce calibrated, probabilistic severe weather guidance from WoF System (WoFS) output. Our dataset includes WoFS ensemble forecasts available every 5 min out to 150 min of lead time from the 2017–19 NOAA Hazardous Weather Testbed Spring Forecasting Experiments (81 dates). Using a novel ensemble storm-track identification method, we extracted three sets of predictors from the WoFS forecasts: intrastorm state variables, near-storm environment variables, and morphological attributes of the ensemble storm tracks. We then trained random forests, gradient-boosted trees, and logistic regression algorithms to predict which WoFS 30-min ensemble storm tracks will overlap a tornado, severe hail, and/or severe wind report. To provide rigorous baselines against which to evaluate the skill of the ML models, we extracted the ensemble probabilities of hazard-relevant WoFS variables exceeding tuned thresholds from each ensemble storm track. The three ML algorithms discriminated well for all three hazards and produced more reliable probabilities than the baseline predictions. Overall, the results suggest that ML-based postprocessing of dynamical ensemble output can improve short-term, storm-scale severe weather probabilistic guidance. 
  2. Abstract While convective storm mode is explicitly depicted in convection-allowing model (CAM) output, subjectively diagnosing mode in large volumes of CAM forecasts can be burdensome. In this work, four machine learning (ML) models were trained to probabilistically classify CAM storms into one of three modes: supercells, quasi-linear convective systems, and disorganized convection. The four ML models included a dense neural network (DNN), logistic regression (LR), a convolutional neural network (CNN) and semi-supervised CNN-Gaussian mixture model (GMM). The DNN, CNN, and LR were trained with a set of hand-labeled CAM storms, while the semi-supervised GMM used updraft helicity and storm size to generate clusters which were then hand labeled. When evaluated using storms withheld from training, the four classifiers had similar ability to discriminate between modes, but the GMM had worse calibration. The DNN and LR had similar objective performance to the CNN, suggesting that CNN-based methods may not be needed for mode classification tasks. The mode classifications from all four classifiers successfully approximated the known climatology of modes in the U.S., including a maximum in supercell occurrence in the U.S. Central Plains. Further, the modes also occurred in environments recognized to support the three different storm morphologies. Finally, storm mode provided useful information about hazard type, e.g., storm reports were most likely with supercells, further supporting the efficacy of the classifiers. Future applications, including the use of objective CAM mode classifications as a novel predictor in ML systems, could potentially lead to improved forecasts of convective hazards. 
  3. Abstract We present an overview of recent work on using artificial intelligence (AI)/machine learning (ML) techniques for forecasting convective weather and its associated hazards, including tornadoes, hail, wind, and lightning. These high-impact phenomena globally cause both massive property damage and loss of life, yet they are very challenging to forecast. Given the recent explosion in developing ML techniques across the weather spectrum and the fact that the skillful prediction of convective weather has immediate societal benefits, we present a thorough review of the current state of the art in AI and ML techniques for convective hazards. Our review includes both traditional approaches, including support vector machines and decision trees, as well as deep learning approaches. We highlight the challenges in developing ML approaches to forecast these phenomena across a variety of spatial and temporal scales. We end with a discussion of promising areas of future work for ML for convective weather, including a discussion of the need to create trustworthy AI forecasts that can be used for forecasters in real time and the need for active cross-sector collaboration on testbeds to validate ML methods in operational situations. Significance Statement We provide an overview of recent machine learning research in predicting hazards from thunderstorms, specifically looking at lightning, wind, hail, and tornadoes. These hazards kill people worldwide and also destroy property and livestock. Improving the prediction of these events in both the local space as well as globally can save lives and property. By providing this review, we aim to spur additional research into developing machine learning approaches for convective hazard prediction. 
  4. Abstract As lightning-detection records lengthen and the efficiency of severe weather reporting increases, more accurate climatologies of convective hazards can be constructed. In this study we aggregate flashes from the National Lightning Detection Network (NLDN) and Arrival Time Difference long-range lightning detection network (ATDnet) with severe weather reports from the European Severe Weather Database (ESWD) and Storm Prediction Center (SPC) Storm Data on a common grid of 0.25° and 1-h steps. Each year approximately 75–200 thunderstorm hours occur over the southwestern, central, and eastern United States, with a peak over Florida (200–250 h). The activity over the majority of Europe ranges from 15 to 100 h, with peaks over Italy and mountains (Pyrenees, Alps, Carpathians, Dinaric Alps; 100–150 h). The highest convective activity over continental Europe occurs during summer and over the Mediterranean during autumn. The United States peak for tornadoes and large hail reports is in spring, preceding the maximum of lightning and severe wind reports by 1–2 months. Convective hazards occur typically in the late afternoon, with the exception of the Midwest and Great Plains, where mesoscale convective systems shift the peak lightning threat to the night. The severe wind threat is delayed by 1–2 h compared to hail and tornadoes. The fraction of nocturnal lightning over land ranges from 15% to 30% with the lowest values observed over Florida and mountains (~10%). Wintertime lightning shares the highest fraction of severe weather. Compared to Europe, extreme events are considerably more frequent over the United States, with maximum activity over the Great Plains. However, the threat over Europe should not be underestimated, as severe weather outbreaks with damaging winds, very large hail, and significant tornadoes occasionally occur over densely populated areas. 
  5. On average, modern numerical weather prediction forecasts for daily tornado frequency exhibit no skill beyond day 10. However, in this extended-range lead window, there are particular model cycles that have exceptionally high forecast skill for tornadoes because of their ability to correctly simulate the future synoptic pattern. Here, model initial conditions that produced a more skillful forecast for tornadoes over the United States were exploited while also highlighting potential causes for low-skill cycles within the Global Ensemble Forecasting System, version 12 (GEFSv12). There were 88 high-skill and 91 low-skill forecasts in which the verifying day-10 synoptic pattern for tornado conditions revealed a western U.S. thermal trough and an eastern U.S. thermal ridge, a favorable configuration for tornadic storm occurrence. Initial conditions for high skill forecasts tended to exhibit warmer sea surface temperatures throughout the tropical Pacific Ocean and Gulf of Mexico, an active Madden–Julian oscillation, and significant modulation of Earth-relative atmospheric angular momentum. Low-skill forecasts were often initialized during La Niña and negative Pacific decadal oscillation conditions. Significant atmospheric blocking over eastern Russia—in which the GEFSv12 overforecast the duration and characteristics of the downstream flow—was a common physical process associated with low-skill forecasts. This work helps to increase our understanding of the common causes of high- or low-skill extended-range tornado forecasts and could serve as a helpful tool for operational forecasters. 