Title: Measuring Sharpness of AI-Generated Meteorological Imagery
Abstract: AI-based algorithms are emerging in many meteorological applications that produce imagery as output, including global weather forecasting models. However, the imagery produced by AI algorithms, especially by convolutional neural networks (CNNs), is often described as too blurry to look realistic, partly because CNNs tend to represent uncertainty as blurriness. This blurriness can be undesirable since it might obscure important meteorological features. More complex AI models, such as generative AI models, produce images that appear to be sharper. However, improved sharpness may come at the expense of other performance criteria, such as standard forecast verification metrics. To navigate any trade-off between sharpness and other performance metrics, it is important to quantitatively assess those other metrics along with sharpness. While there is a rich set of forecast verification metrics available for meteorological images, none of them focuses on sharpness. This paper seeks to fill this gap by 1) exploring a variety of sharpness metrics from other fields, 2) evaluating properties of these metrics, 3) proposing the new concept of Gaussian Blur Equivalence as a tool for their uniform interpretation, and 4) demonstrating their use for sample meteorological applications, including a CNN that emulates radar imagery from satellite imagery (GREMLIN) and an AI-based global weather forecasting model (GraphCast).
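This record does not define Gaussian Blur Equivalence or list the sharpness metrics the paper studies, so the following is only a minimal sketch of one plausible reading: score an image with a simple sharpness metric, then report the Gaussian blur width (sigma, in pixels) that would have to be applied to a sharp reference image to reach the same score. The choice of mean gradient magnitude as the metric, the bisection search, and all function names (`gaussian_blur`, `mean_gradient_magnitude`, `blur_equivalent_sigma`) are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding; sigma is in pixels."""
    out = np.asarray(img, dtype=float)
    if sigma <= 0:
        return out.copy()
    radius = max(1, int(np.ceil(4 * sigma)))  # truncate the kernel at 4 sigma
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    for axis in (0, 1):
        pad = [(0, 0), (0, 0)]
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        # "valid" convolution of the padded rows/columns restores the original size
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out


def mean_gradient_magnitude(img):
    """Stand-in sharpness metric (an assumption): larger means sharper."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return float(np.mean(np.hypot(gx, gy)))


def blur_equivalent_sigma(reference, metric_value, sigma_max=10.0, tol=1e-3):
    """Find sigma such that blurring `reference` yields `metric_value`.

    Assumes the metric decreases monotonically as sigma grows, which is
    what makes bisection applicable here.
    """
    if mean_gradient_magnitude(reference) <= metric_value:
        return 0.0  # the image is already at least as sharp as the reference
    lo, hi = 0.0, sigma_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_gradient_magnitude(gaussian_blur(reference, mid)) > metric_value:
            lo = mid  # still too sharp: more blur is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.random((64, 64))              # synthetic "sharp" field
    prediction = gaussian_blur(truth, 1.7)    # synthetic "blurry" output
    score = mean_gradient_magnitude(prediction)
    print("sharpness:", score)
    print("blur-equivalent sigma:", blur_equivalent_sigma(truth, score))
```

Expressing a metric value as an equivalent blur width puts different sharpness metrics on a common, physically interpretable scale (pixels of blur), which is the kind of uniform interpretation the abstract describes.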
Award ID(s):
2425735
PAR ID:
10614981
Publisher / Repository:
American Meteorological Society
Journal Name:
Artificial Intelligence for the Earth Systems
ISSN:
2769-7525
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: Pure artificial intelligence (AI)-based weather prediction (AIWP) models have made waves within the scientific community and the media, claiming superior performance to numerical weather prediction (NWP) models. However, these models often lack impactful output variables such as precipitation. One exception is Google DeepMind’s GraphCast model, which became the first mainstream AIWP model to predict precipitation, but performed only limited verification. We present an analysis of the ECMWF’s Integrated Forecasting System (IFS)-initialized (GRAPIFS) and the NCEP’s Global Forecast System (GFS)-initialized (GRAPGFS) GraphCast precipitation forecasts over the contiguous United States and compare to results from the GFS and IFS models using 1) grid-based, 2) neighborhood, and 3) object-oriented metrics verified against the fifth major global reanalysis produced by ECMWF (ERA5) and the NCEP/Environmental Modeling Center (EMC) stage IV precipitation analysis datasets. We affirmed that GRAPGFS and GRAPIFS perform better than the GFS and IFS in terms of root-mean-square error and stable equitable errors in probability space, but the GFS and IFS precipitation distributions more closely align with the ERA5 and stage IV distributions. Equitable threat score also generally favored GraphCast, particularly for lower accumulation thresholds. Fractions skill score for increasing neighborhood sizes shows greater gains for the GFS and IFS than GraphCast, suggesting the NWP models may have a better handle on intensity but struggle with the location. Object-based verification for GraphCast found positive area biases at low accumulation thresholds and large negative biases at high accumulation thresholds. GRAPGFS saw similar performance gains to GRAPIFS when compared to their NWP counterparts, but initializing with the less familiar GFS conditions appeared to lead to an increase in light precipitation.
Significance Statement: Pure artificial intelligence (AI)-based weather prediction (AIWP) has exploded in popularity with promises of better performance and faster run times than numerical weather prediction (NWP) models. However, less attention has been paid to their capability to predict impactful, sensible weather like precipitation, precipitation type, or specific meteorological features. We seek to address this gap by comparing precipitation forecast performance by an AI model called GraphCast to the Global Forecast System (GFS) and the Integrated Forecasting System (IFS) NWP models. While GraphCast does perform better on many verification metrics, it has some limitations for intense precipitation forecasts. In particular, it less frequently predicts intense precipitation events than the GFS or IFS. Overall, this article emphasizes the promise of AIWP while at the same time stressing the need for robust verification by domain experts.
  2. This project developed a pre-interview survey, interview protocols, and materials for conducting interviews with expert users to better understand how they assess and make use decisions about new AI/ML guidance. Weather forecasters access and synthesize myriad sources of information when forecasting for high-impact, severe weather events. In recent years, artificial intelligence (AI) techniques have increasingly been used to produce new guidance tools with the goal of aiding weather forecasting, including for severe weather. For this study, we leveraged these advances to explore how National Weather Service (NWS) forecasters perceive the use of new AI guidance for forecasting severe hail and storm mode. We also specifically examined which guidance features are important for how forecasters assess the trustworthiness of new AI guidance. To this end, we conducted online, structured interviews with NWS forecasters from across the Eastern, Central, and Southern Regions. The interviews covered the forecasters’ approaches and challenges for forecasting severe weather, perceptions of AI and its use in forecasting, and reactions to one of two experimental (i.e., non-operational) AI severe weather guidance products: probability of severe hail or probability of storm mode. During the interview, the forecasters went through a self-guided review of different sets of information about the development (spin-up information, AI model technique, training of AI model, input information) and performance (verification metrics, interactive output, output comparison to operational guidance) of the presented guidance. The forecasters then assessed how the information influenced their perception of how trustworthy the guidance was and whether or not they would consider using it for forecasting. This project includes the pre-interview survey, survey data, interview protocols, and accompanying information boards used for the interviews.
There is one set of interview materials in which AI/ML is mentioned throughout and another set in which AI/ML was mentioned only at the end of the interviews. We did this to better understand how the label “AI/ML” did or did not affect how interviewees responded to interview questions and reviewed the information board. We also leveraged think-aloud methods with the information board, the instructions for which are included in the interview protocols.
  3. Abstract: The development of deep learning (DL) weather forecasting models has made rapid progress and achieved comparable or better skill than traditional numerical weather prediction (NWP) models, which are generally computationally intensive. However, applications of these DL models have yet to be fully explored, including for severe convective events. We evaluate the DL model Pangu‐Weather in forecasting tornadic environments with one‐day lead times using convective available potential energy (CAPE), 0–6 km bulk wind difference (BWD6), and 0–3 km storm‐relative helicity (SRH3). We also compare its performance to the National Centers for Environmental Prediction (NCEP)'s Global Forecast System (GFS), a traditional NWP model. Pangu‐Weather generally outperforms GFS in predicting BWD6 and SRH3 at the closest grid point and hour of the storm report. However, Pangu‐Weather tends to underpredict the maximum values of all convective parameters in the 1–2 hr before the storm across the surrounding grid points compared to the GFS.