Search for: All records

Award ID contains: 2019758

Improving Ensemble Extreme Precipitation Forecasts Using Generative Artificial Intelligence

https://doi.org/10.1175/AIES-D-24-0063.1

Sha, Yingkai; Sobash, Ryan A; Gagne, David John (April 2025, Artificial Intelligence for the Earth Systems)

Abstract An ensemble postprocessing method is developed to improve the probabilistic forecasts of extreme precipitation events across the conterminous United States (CONUS). The method combines a 3D vision transformer (ViT) for bias correction with a latent diffusion model (LDM), a generative artificial intelligence (AI) method, to postprocess 6-hourly precipitation ensemble forecasts and produce an enlarged generative ensemble that contains spatiotemporally consistent precipitation trajectories. These trajectories are expected to improve the characterization of extreme precipitation events and offer skillful multiday accumulated and 6-hourly precipitation guidance. The method is tested using the Global Ensemble Forecast System (GEFS) precipitation forecasts out to day 6 and is verified against the Climatology-Calibrated Precipitation Analysis (CCPA) data. Verification results indicate that the method generated skillful ensemble members with improved continuous ranked probabilistic skill scores (CRPSSs) and Brier skill scores (BSSs) over the raw operational GEFS and a multivariate statistical postprocessing baseline. It showed skillful and reliable probabilities for events at extreme precipitation thresholds. Explainability studies were further conducted, which revealed the decision-making process of the method and confirmed its effectiveness on ensemble member generation. This work introduces a novel, generative AI–based approach to address the limitation of small numerical ensembles and the need for larger ensembles to identify extreme precipitation events.

Significance Statement We use a new artificial intelligence (AI) technique to improve extreme precipitation forecasts from a numerical weather prediction ensemble, generating more scenarios that better characterize extreme precipitation events. This AI-generated ensemble improved the accuracy of precipitation forecasts and probabilistic warnings for extreme precipitation events. The study explores AI methods to generate precipitation forecasts and explains the decision-making mechanisms of such AI techniques to prove their effectiveness.
Free, publicly-accessible full text available April 1, 2026
Leveraging Co-Production to Bridge Research and Operations in Operational Meteorology

https://doi.org/10.1175/WAF-D-24-0145.1

Harrison, David R; McGovern, Amy; Karstens, Christopher D; Bostrom, Ann; Jirak, Israel L; Marsh, Patrick T (May 2025, Weather and Forecasting)

Abstract The benefits of collaboration between the research and operational communities during the research-to-operations (R2O) process have long been documented in the scientific literature. Operational forecasters have a practiced, expert insight into weather analysis and forecasting but typically lack the time and resources for formal research and development. Conversely, many researchers have the resources, theoretical knowledge, and formal experience to solve complex meteorological challenges but lack an understanding of operation procedures, needs, requirements, and authority necessary to effectively bridge the R2O gap. Collaboration then serves as the most viable strategy to further a better understanding and improved prediction of atmospheric processes via ongoing multi-disciplinary knowledge transfer between the research and operational communities. However, existing R2O processes leave room for improvement when it comes to collaboration throughout a new product’s development cycle. This study assesses the subjective importance of collaboration at various stages of product development via a survey presented to participants of the 2021 Hazardous Weather Testbed Spring Forecasting Experiment. This feedback is then applied to create a proposed new R2O workflow that combines components from existing R2O procedures and modern co-production philosophies.
Free, publicly-accessible full text available May 19, 2026
An Assessment of How Domain Experts Evaluate Machine Learning in Operational Meteorology

https://doi.org/10.1175/WAF-D-24-0144.1

Harrison, David R; McGovern, Amy; Karstens, Christopher D; Bostrom, Ann; Demuth, Julie L; Jirak, Israel L; Marsh, Patrick T (March 2025, Weather and Forecasting)

Abstract As an increasing number of machine learning (ML) products enter the research-to-operations (R2O) pipeline, researchers have anecdotally noted a perceived hesitancy by operational forecasters to adopt this relatively new technology. One explanation often cited in the literature is that this perceived hesitancy derives from the complex and opaque nature of ML methods. Because modern ML models are trained to solve tasks by optimizing a potentially complex combination of mathematical weights, thresholds, and nonlinear cost functions, it can be difficult to determine how these models reach a solution from their given input. However, it remains unclear to what degree a model’s transparency may influence a forecaster’s decision to use that model or if that impact differs between ML and more traditional (i.e., non-ML) methods. To address this question, a survey was offered to forecaster and researcher participants attending the 2021 NOAA Hazardous Weather Testbed (HWT) Spring Forecasting Experiment (SFE) with questions about how participants subjectively perceive and compare machine learning products to more traditionally derived products. Results from this study revealed few differences in how participants evaluated machine learning products compared to other types of guidance. However, comparing the responses between operational forecasters, researchers, and academics exposed notable differences in what factors the three groups considered to be most important for determining the operational success of a new forecast product. These results support the need for increased collaboration between the operational and research communities.

Significance Statement Participants of the 2021 Hazardous Weather Testbed Spring Forecasting Experiment were surveyed to assess how machine learning products are perceived and evaluated in operational settings. The results revealed little difference in how machine learning products are evaluated compared to more traditional methods but emphasized the need for explainable product behavior and comprehensive end-user training.
Free, publicly-accessible full text available March 1, 2026
A Regimes‐Based Approach to Identifying Seasonal State‐Dependent Prediction Skill

https://doi.org/10.1029/2024JD042917

Shackelford, Kyle; DeMott, Charlotte A; van Leeuwen, Peter Jan; Barnes, Elizabeth A (April 2025, Journal of Geophysical Research: Atmospheres)

Abstract Subseasonal‐to‐decadal atmospheric prediction skill attained from initial conditions is typically limited by the chaotic nature of the atmosphere. However, for some atmospheric phenomena, prediction skill on subseasonal‐to‐decadal timescales is increased when the initial conditions are in a particular state. In this study, we employ machine learning to identify sea surface temperature (SST) regimes that enhance prediction skill of North Atlantic atmospheric circulation. An ensemble of artificial neural networks is trained to predict anomalous, low‐pass filtered 500 mb height at 7–8 weeks lead using SST. We then use self‐organizing maps (SOMs) constructed from 9 regions within the SST domain to detect state‐dependent prediction skill. SOMs are built using the entire SST time series, and we assess which SOM units feature confident neural network predictions. Four regimes are identified that provide skillful seasonal predictions of 500 mb height. Our findings demonstrate the importance of extratropical decadal SST variability in modulating downstream ENSO teleconnections to the North Atlantic. The methodology presented could aid future forecasting on subseasonal‐to‐decadal timescales.
Free, publicly-accessible full text available April 28, 2026
A Comparison of AI Weather Prediction and Numerical Weather Prediction Models for 1–7-Day Precipitation Forecasts

https://doi.org/10.1175/WAF-D-24-0081.1

Radford, Jacob T; Ebert-Uphoff, Imme; Stewart, Jebb Q (April 2025, Weather and Forecasting)

Abstract Pure artificial intelligence (AI)-based weather prediction (AIWP) models have made waves within the scientific community and the media, claiming superior performance to numerical weather prediction (NWP) models. However, these models often lack impactful output variables such as precipitation. One exception is Google DeepMind’s GraphCast model, which became the first mainstream AIWP model to predict precipitation, but performed only limited verification. We present an analysis of the ECMWF’s Integrated Forecasting System (IFS)-initialized (GRAP_IFS) and the NCEP’s Global Forecast System (GFS)-initialized (GRAP_GFS) GraphCast precipitation forecasts over the contiguous United States and compare to results from the GFS and IFS models using 1) grid-based, 2) neighborhood, and 3) object-oriented metrics verified against the fifth major global reanalysis produced by ECMWF (ERA5) and the NCEP/Environmental Modeling Center (EMC) stage IV precipitation analysis datasets. We affirmed that GRAP_GFS and GRAP_IFS perform better than the GFS and IFS in terms of root-mean-square error and stable equitable errors in probability space, but the GFS and IFS precipitation distributions more closely align with the ERA5 and stage IV distributions. Equitable threat score also generally favored GraphCast, particularly for lower accumulation thresholds. Fractions skill score for increasing neighborhood sizes shows greater gains for the GFS and IFS than GraphCast, suggesting the NWP models may have a better handle on intensity but struggle with the location. Object-based verification for GraphCast found positive area biases at low accumulation thresholds and large negative biases at high accumulation thresholds. GRAP_GFS saw similar performance gains to GRAP_IFS when compared to their NWP counterparts, but initializing with the less familiar GFS conditions appeared to lead to an increase in light precipitation.

Significance Statement Pure artificial intelligence (AI)-based weather prediction (AIWP) has exploded in popularity with promises of better performance and faster run times than numerical weather prediction (NWP) models. However, less attention has been paid to their capability to predict impactful, sensible weather like precipitation, precipitation type, or specific meteorological features. We seek to address this gap by comparing precipitation forecast performance by an AI model called GraphCast to the Global Forecast System (GFS) and the Integrated Forecasting System (IFS) NWP models. While GraphCast does perform better on many verification metrics, it has some limitations for intense precipitation forecasts. In particular, it less frequently predicts intense precipitation events than the GFS or IFS. Overall, this article emphasizes the promise of AIWP while at the same time stresses the need for robust verification by domain experts.
Free, publicly-accessible full text available April 1, 2026
Gulf Stream Near Cape Hatteras Modulates Sea Level Variability Along the Southeastern Coast of North America

https://doi.org/10.1029/2024GL112776

Wu, Tianning; He, Ruoying (April 2025, Geophysical Research Letters)

Abstract Studies suggest a strong link between low‐frequency sea level variability in the South Atlantic Bight (SAB) and open ocean dynamics. However, the mechanisms driving this connection remain unclear. By analyzing a high‐resolution, three‐dimensional baroclinic ocean reanalysis, we identify a pathway that links open ocean dynamics to SAB coastal sea level variability through the shelf edge near Cape Hatteras. Gulf Stream meanders in this region induce sea level fluctuations that propagate along the entire SAB shelf. Using an idealized barotropic model, we further demonstrate that topographic waves mediate the propagation of the Gulf Stream signal onto the shelf. Moreover, the Gulf Stream variability is driven by zonal wind stress in the Northwest Atlantic, which is likely modulated by the North Atlantic Oscillation. These findings offer new insights into regional sea level prediction and contribute to broader climate research efforts.
Identifying data sources and physical strategies used by neural networks to predict TC rapid intensification

https://doi.org/10.1175/WAF-D-24-0166.1

Lagerquist, Ryan; Knaff, John A; Slocum, Christopher J; Musgrave, Kate; Ebert-Uphoff, Imme (May 2025, Weather and Forecasting)

Abstract The rapid intensification (RI) of tropical cyclones (TC), defined here as an intensity increase of ≥ 30 kt in 24 hours, is a difficult but important forecasting problem. Operational RI forecasts have considerably improved since the late 2000s, largely thanks to better statistical models, including machine learning (ML). Most ML applications use scalars from the Statistical Hurricane Intensity Prediction Scheme (SHIPS) development dataset as predictors, describing the TC history, near-TC environment, and satellite presentation of the TC. More recent ML applications use convolutional neural networks (CNN), which can ingest full satellite images (or time series of images) and freely “decide” which spatiotemporal features are important for RI. However, two questions remain unanswered: (1) Does image convolution significantly improve RI skill? (2) What strategies do CNNs use for RI prediction – and can we gain new insights from these strategies? We use an ablation experiment to answer the first question and explainable artificial intelligence (XAI) to answer the second. Convolution leads to only a small performance gain, likely because, as revealed by XAI, the CNN’s main strategy uses image features already well described in scalar predictors used by pre-existing RI models. This work makes three additional contributions to the literature: (1) NNs with SHIPS data outperform pre-existing models in some aspects; (2) NNs provide well calibrated uncertainty quantification (UQ), while pre-existing models have no UQ; (3) the NN without SHIPS data performs surprisingly well and is fairly independent of pre-existing models, suggesting its potential value in an operational ensemble.
Free, publicly-accessible full text available May 15, 2026
The Observed Availability of Data and Code in Earth Science and Artificial Intelligence

https://doi.org/10.1175/BAMS-D-24-0147.1

Jones, Erin A; McClung, Brandon; Fawad, Hadi; McGovern, Amy (May 2025, Bulletin of the American Meteorological Society)

Abstract As the use of artificial intelligence (AI) has grown exponentially across a wide variety of science applications, it has become clear that it is critical to share data and code to facilitate reproducibility and innovation. AMS recently adopted the requirement that all papers include an availability statement. However, there is no requirement to ensure that the data and code are actually freely accessible during and after publication. Studies show that without this requirement, data is openly available in about a third to a half of journal articles. In this work, we surveyed two AMS journals, Artificial Intelligence for the Earth Systems (AIES) and Monthly Weather Review (MWR), and two non-AMS journals. These journals varied in primary topic foci, publisher, and requirement of an availability statement. We examined the extent to which data and code are stated to be available in all four journals, if readers could easily access the data and code, and what common justifications were provided for articles without open data or code. Our analysis found that roughly 75% of all articles that produced data and had an availability statement made at least some of their data openly available. Code was made openly available less frequently in three out of the four journals examined. Access was inhibited to data or code in approximately 15% of availability statements that contained at least one link. Finally, the most common justifications for not making data or code openly available referenced dataset size and restrictions of availability from non-co-author entities.
Free, publicly-accessible full text available May 7, 2026
FrontFinder AI: Efficient Identification of Frontal Boundaries over the Continental United States and NOAA’s Unified Surface Analysis Domain Using the UNET3+ Model Architecture

https://doi.org/10.1175/AIES-D-24-0043.1

Justin, Andrew D; McGovern, Amy; Allen, John T (January 2025, Artificial Intelligence for the Earth Systems)

Abstract FrontFinder artificial intelligence (AI) is a novel machine learning algorithm trained to detect cold, warm, stationary, and occluded fronts and drylines. Fronts are associated with many high-impact weather events around the globe. Frontal analysis is still primarily done by human forecasters, often implementing their own rules and criteria for determining front positions. Such techniques result in multiple solutions by different forecasters when given identical sets of data. Numerous studies have attempted to automate frontal analysis through numerical frontal analysis. In recent years, machine learning algorithms have gained more popularity in meteorology due to their ability to learn complex relationships. Our algorithm was able to reproduce three-quarters of forecaster-drawn fronts over CONUS and NOAA’s unified surface analysis domain on independent testing datasets. We applied permutation studies, an explainable artificial intelligence method, to identify the importance of each variable for each front type. The permutation studies showed that the most “important” variables for detecting fronts are consistent with observed processes in the evolution of frontal boundaries. We applied the model to an extratropical cyclone over the central United States to see how the model handles the occlusion process, with results showing that the model can resolve the early stages of occluded fronts wrapping around cyclone centers. While our algorithm is not intended to replace human forecasters, the model can streamline operational workflows by providing efficient frontal boundary identification guidance. FrontFinder has been deployed operationally at NOAA’s Weather Prediction Center.

Significance Statement Frontal boundaries drive many high-impact weather events worldwide. Identification and classification of frontal boundaries is necessary to anticipate changing weather conditions; however, frontal analysis is still mainly performed by human forecasters, leaving room for subjective interpretations during the frontal analysis process. We have introduced a novel machine learning method that identifies cold, warm, stationary, and occluded fronts and drylines without the need for high-end computational resources. This algorithm can be used as a tool to expedite the frontal analysis process by ingesting real-time data in operational environments.
Free, publicly-accessible full text available January 1, 2026
The influence of correlated features on neural network attribution methods in geoscience

https://doi.org/10.1017/eds.2025.19

Krell, Evan; Mamalakis, Antonios; King, Scott A; Tissot, Philippe; Ebert-Uphoff, Imme (January 2025, Environmental Data Science)

Abstract Artificial neural networks are increasingly used for geophysical modeling to extract complex nonlinear patterns from geospatial data. However, it is difficult to understand how networks make predictions, limiting trust in the model, debugging capacity, and physical insights. EXplainable Artificial Intelligence (XAI) techniques expose how models make predictions, but XAI results may be influenced by correlated features. Geospatial data typically exhibit substantial autocorrelation. With correlated input features, learning methods can produce many networks that achieve very similar performance (e.g., arising from different initializations). Since the networks capture different relationships, their attributions can vary. Correlated features may also cause inaccurate attributions because XAI methods typically evaluate isolated features, whereas networks learn multifeature patterns. Few studies have quantitatively analyzed the influence of correlated features on XAI attributions. We use a benchmark framework of synthetic data with increasingly strong correlation, for which the ground truth attribution is known. For each dataset, we train multiple networks and compare XAI-derived attributions to the ground truth. We show that correlation may dramatically increase the variance of the derived attributions, and investigate the cause of the high variance: is it because different trained networks learn highly different functions or because XAI methods become less faithful in the presence of correlation? Finally, we show XAI applied to superpixels, instead of single grid cells, substantially decreases attribution variance. Our study is the first to quantify the effects of strong correlation on XAI, to investigate the reasons that underlie these effects, and to offer a promising way to address them.
Free, publicly-accessible full text available January 1, 2026