Abstract Pure artificial intelligence (AI)-based weather prediction (AIWP) models have made waves within the scientific community and the media, claiming superior performance to numerical weather prediction (NWP) models. However, these models often lack impactful output variables such as precipitation. One exception is Google DeepMind’s GraphCast model, which became the first mainstream AIWP model to predict precipitation, but for which only limited verification was performed. We present an analysis of the ECMWF’s Integrated Forecasting System (IFS)-initialized (GRAPIFS) and the NCEP’s Global Forecast System (GFS)-initialized (GRAPGFS) GraphCast precipitation forecasts over the contiguous United States and compare them to results from the GFS and IFS models using 1) grid-based, 2) neighborhood, and 3) object-oriented metrics verified against the fifth major global reanalysis produced by ECMWF (ERA5) and the NCEP/Environmental Modeling Center (EMC) stage IV precipitation analysis datasets. We affirmed that GRAPGFS and GRAPIFS perform better than the GFS and IFS in terms of root-mean-square error and stable equitable error in probability space, but the GFS and IFS precipitation distributions more closely align with the ERA5 and stage IV distributions. Equitable threat score also generally favored GraphCast, particularly for lower accumulation thresholds. Fractions skill score for increasing neighborhood sizes shows greater gains for the GFS and IFS than GraphCast, suggesting the NWP models may have a better handle on intensity but struggle with the location. Object-based verification for GraphCast found positive area biases at low accumulation thresholds and large negative biases at high accumulation thresholds. GRAPGFS saw similar performance gains to GRAPIFS when compared to their NWP counterparts, but initializing with the less familiar GFS conditions appeared to lead to an increase in light precipitation.
Significance Statement Pure artificial intelligence (AI)-based weather prediction (AIWP) has exploded in popularity with promises of better performance and faster run times than numerical weather prediction (NWP) models. However, less attention has been paid to their capability to predict impactful, sensible weather like precipitation, precipitation type, or specific meteorological features. We seek to address this gap by comparing precipitation forecast performance by an AI model called GraphCast to the Global Forecast System (GFS) and the Integrated Forecasting System (IFS) NWP models. While GraphCast does perform better on many verification metrics, it has some limitations for intense precipitation forecasts. In particular, it less frequently predicts intense precipitation events than the GFS or IFS. Overall, this article emphasizes the promise of AIWP while at the same time stressing the need for robust verification by domain experts.
Exploratory Precipitation Metrics: Spatiotemporal Characteristics, Process-Oriented, and Phenomena-Based
Abstract Precipitation sustains life and supports human activities, making its prediction one of the most societally relevant challenges in weather and climate modeling. Limitations in modeling precipitation underscore the need for diagnostics and metrics to evaluate precipitation in simulations and predictions. While routine use of basic metrics is important for documenting model skill, more sophisticated diagnostics and metrics aimed at connecting model biases to their sources and revealing precipitation characteristics relevant to how model precipitation is used are critical for improving models and their uses. This paper illustrates examples of exploratory diagnostics and metrics including 1) spatiotemporal characteristics metrics such as diurnal variability, probability of extremes, duration of dry spells, spectral characteristics, and spatiotemporal coherence of precipitation; 2) process-oriented metrics based on the rainfall–moisture coupling and temperature–water vapor environments of precipitation; and 3) phenomena-based metrics focusing on precipitation associated with weather phenomena including low pressure systems, mesoscale convective systems, frontal systems, and atmospheric rivers. Together, these diagnostics and metrics delineate the multifaceted and multiscale nature of precipitation, its relations with the environments, and its generation mechanisms. The metrics are applied to historical simulations from phases 5 and 6 of the Coupled Model Intercomparison Project. Models exhibit diverse skill as measured by the suite of metrics, with very few models consistently ranked as top or bottom performers compared to other models in multiple metrics. Analysis of model skill across metrics and models suggests possible relationships among subsets of metrics, motivating the need for more systematic analysis to understand model biases for informing model development.
- Award ID(s): 1936810
- PAR ID: 10393329
- Publisher / Repository: American Meteorological Society
- Date Published:
- Journal Name: Journal of Climate
- Volume: 35
- Issue: 12
- ISSN: 0894-8755
- Format(s): Medium: X Size: p. 3659-3686
- Size(s): p. 3659-3686
- Sponsoring Org: National Science Foundation
More Like this
We propose a set of MJO teleconnection diagnostics that enables an objective evaluation of model simulations, a fair model-to-model comparison, and a consistent tracking of model improvement. Various skill metrics are derived from teleconnection diagnostics including five performance-based metrics that characterize the pattern, amplitude, east–west position, persistence, and consistency of MJO teleconnections and two additional process-oriented metrics that are designed to characterize the location and intensity of the anomalous Rossby wave source (RWS). The proposed teleconnection skill metrics are used to compare the characteristics of boreal winter MJO teleconnections (500-hPa geopotential height anomaly) over the Pacific–North America (PNA) region in 29 global climate models (GCMs). The results show that current GCMs generally produce MJO teleconnections that are stronger, more persistent, and extend too far to the east when compared to those observed in reanalysis. In general, models simulate more realistic teleconnection patterns when the MJO is in phases 2–3 or phases 7–8, which are characterized by a dipole convection pattern over the Indian Ocean and western to central Pacific. The higher model skill for phases 2, 7, and 8 may be due to these phases producing more consistent teleconnection patterns between individual MJO events than other phases, although the consistency is lower in most models than observed. Models that simulate realistic RWS patterns better reproduce MJO teleconnection patterns.
Abstract Machine-learning-based methods that identify drought in three-dimensional space–time are applied to climate model simulations and tree-ring-based reconstructions of hydroclimate over the Northern Hemisphere extratropics for the past 1000 years, as well as twenty-first-century projections. Analyzing reconstructed and simulated drought in this context provides a paleoclimate constraint on the spatiotemporal characteristics of simulated droughts. Climate models project that there will be large increases in the persistence and severity of droughts over the coming century, but with little change in their spatial extent. Nevertheless, climate models exhibit biases in the spatiotemporal characteristics of persistent and severe droughts over parts of the Northern Hemisphere. We use the paleoclimate record and results from a linear inverse modeling-based framework to conclude that climate models underestimate the range of potential future hydroclimate states. Complicating this picture, however, are divergent changes in the characteristics of persistent and severe droughts when quantified using different hydroclimate metrics. Collectively our results imply that these divergent responses and the aforementioned biases must be better understood if we are to increase confidence in future hydroclimate projections. Importantly, the novel framework presented herein can be applied to other climate features to robustly describe their spatiotemporal characteristics and provide constraints on future changes to those characteristics.
Hydroclimate and terrestrial hydrology greatly influence the local community, ecosystem, and economy in Alaska and the Yukon River Basin. A high‐resolution simulation of the historical climate in Alaska can provide an important benchmark for climate change studies. In this study, we utilized the Regional Arctic System Model (RASM) and conducted coupled land‐atmosphere modeling for Alaska and the Yukon River Basin at 4‐km grid spacing. In RASM, the land model was replaced with the Community Terrestrial Systems Model (CTSM) given its comprehensive process representations for cold regions. The microphysics schemes in the Weather Research and Forecasting (WRF) atmospheric model were manually tuned for optimal model performance. This study aims to maintain good model performance for both hydroclimate and terrestrial hydrology, especially streamflow, which was rarely a priority in coupled models. Therefore, we implemented a strategy of iterative testing and optimization of CTSM. A multi‐decadal climate data set (1990–2021) was generated using RASM with optimized land parameters and manually tuned WRF microphysics. When evaluated against multiple observational data sets, this data set captures well the climate statistics and spatial distributions for five key weather variables and hydrologic fluxes, including precipitation, air temperature, snow fraction, evaporation‐to‐precipitation ratios, and streamflow. The simulated precipitation shows a wet bias during the spring season, and simulated air temperatures exhibit dampened seasonality with warm biases in winter and cold biases in summer. We used transfer entropy to investigate the discrepancy in connectivity of hydrologic and energy fluxes between the offline CTSM and coupled models, which contributed to their discrepancy in streamflow simulations.
Abstract The modeling of weather and climate has been a success story. The skill of forecasts continues to improve and model biases continue to decrease. Combining the output of multiple models has further improved forecast skill and reduced biases. But are we exploiting the full capacity of state-of-the-art models in making forecasts and projections? Supermodeling is a recent step forward in the multimodel ensemble approach. Instead of combining model output after the simulations are completed, in a supermodel individual models exchange state information as they run, influencing each other’s behavior. By learning the optimal parameters that determine how models influence each other based on past observations, model errors are reduced at an early stage before they propagate into larger scales and affect other regions and variables. The models synchronize on a common solution that through learning remains closer to the observed evolution. Effectively a new dynamical system has been created, a supermodel, that optimally combines the strengths of the constituent models. The supermodel approach has the potential to rapidly improve current state-of-the-art weather forecasts and climate predictions. In this paper we introduce supermodeling, demonstrate its potential in examples of various complexity, and discuss learning strategies. We conclude with a discussion of remaining challenges for a successful application of supermodeling in the context of state-of-the-art models. The supermodeling approach is not limited to the modeling of weather and climate, but can be applied to improve the prediction capabilities of any complex system, for which a set of different models exists.
