Title: Recommendations for Comprehensive and Independent Evaluation of Machine Learning‐Based Earth System Models
Abstract: Machine learning (ML) is a revolutionary technology with demonstrable applications across multiple disciplines. Within the Earth science community, ML has been most visible for weather forecasting, producing forecasts that rival modern physics‐based models. Given the importance of deepening our understanding and improving predictions of the Earth system on all time scales, efforts are now underway to develop Earth‐system models (ESMs) capable of representing all components of the coupled Earth system (or their aggregated behavior) and their response to external changes over long timescales. Building trust in ESMs is a much more difficult problem than for weather forecast models, not least because the model must represent the alternate (e.g., future or paleoclimatic) coupled states of the system for which there are no direct observations. Given that the physical principles that enable predictions about the response of the Earth system are often not explicitly coded in these ML‐based models, demonstrating the credibility of ML‐based ESMs thus requires us to build evidence of their consistency with the physical system. To this end, this paper puts forward five recommendations to enhance comprehensive, standardized, and independent evaluation of ML‐based ESMs to strengthen their credibility and promote their wider use.
Award ID(s):
2019625
PAR ID:
10577488
Publisher / Repository:
DOI PREFIX: 10.1029
Journal Name:
Journal of Geophysical Research: Machine Learning and Computation
Volume:
2
Issue:
1
ISSN:
2993-5210
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Buchan, Alison (Ed.)
    ABSTRACT Climate change jeopardizes human health, global biodiversity, and sustainability of the biosphere. To make reliable predictions about climate change, scientists use Earth system models (ESMs) that integrate physical, chemical, and biological processes occurring on land, the oceans, and the atmosphere. Although critical for catalyzing coupled biogeochemical processes, microorganisms have traditionally been left out of ESMs. Here, we generate a “top 10” list of priorities, opportunities, and challenges for the explicit integration of microorganisms into ESMs. We discuss the need for coarse-graining microbial information into functionally relevant categories, as well as the capacity for microorganisms to rapidly evolve in response to climate-change drivers. Microbiologists are uniquely positioned to collect novel and valuable information necessary for next-generation ESMs, but this requires data harmonization and transdisciplinary collaboration to effectively guide adaptation strategies and mitigation policy. 
  2. Abstract Machine learning (ML) has been applied to space weather problems with increasing frequency in recent years, driven by an influx of in-situ measurements and a desire to improve modeling and forecasting capabilities throughout the field. Space weather originates from solar perturbations and is comprised of the resulting complex variations they cause within the numerous systems between the Sun and Earth. These systems are often tightly coupled and not well understood. This creates a need for skillful models with knowledge about the confidence of their predictions. One example of such a dynamical system highly impacted by space weather is the thermosphere, the neutral region of Earth’s upper atmosphere. Our inability to forecast it has severe repercussions in the context of satellite drag and computation of probability of collision between two space objects in low Earth orbit (LEO) for decision making in space operations. Even with (assumed) perfect forecast of model drivers, our incomplete knowledge of the system results in often inaccurate thermospheric neutral mass density predictions. Continuing efforts are being made to improve model accuracy, but density models rarely provide estimates of confidence in predictions. In this work, we propose two techniques to develop nonlinear ML regression models to predict thermospheric density while providing robust and reliable uncertainty estimates: Monte Carlo (MC) dropout and direct prediction of the probability distribution, both using the negative logarithm of predictive density (NLPD) loss function. We show the performance capabilities for models trained on both local and global datasets. We show that the NLPD loss provides similar results for both techniques but the direct probability distribution prediction method has a much lower computational cost. 
For the global model regressed on the Space Environment Technologies High Accuracy Satellite Drag Model (HASDM) density database, we achieve errors of approximately 11% on independent test data with well-calibrated uncertainty estimates. Using an in-situ CHAllenging Minisatellite Payload (CHAMP) density dataset, models developed using both techniques provide test error on the order of 13%. The CHAMP models—on validation and test data—are within 2% of perfect calibration for the twenty prediction intervals tested. We show that this model can also be used to obtain global density predictions with uncertainties at a given epoch. 
  3. Abstract Global climate models (GCMs) and Earth system models (ESMs) exhibit biases, with resolutions too coarse to capture local variability for fine-scale, reliable drought and climate impact assessment. However, conventional bias correction approaches may cause implausible climate change signals due to unrealistic representations of spatial and intervariable dependences. While purely data-driven deep learning has achieved significant progress in improving climate and earth system simulations and predictions, they cannot reliably learn the circumstances (e.g., extremes) that are largely unseen in historical climate but likely becoming more frequent in the future climate (i.e., climate non-stationarity). This study shows an integrated trend-preserving deep learning approach that can address the spatial and intervariable dependences and climate non-stationarity issues for downscaling and bias correcting GCMs/ESMs. Here we combine the super-resolution deep residual network (SRDRN) with the trend-preserving quantile delta mapping (QDM) to downscale and bias correct six primary climate variables at once (including daily precipitation, maximum temperature, minimum temperature, relative humidity, solar radiation, and wind speed) from five state-of-the-art GCMs/ESMs in the Coupled Model Intercomparison Project Phase 6 (CMIP6). We found that the SRDRN-QDM approach greatly reduced GCMs/ESMs biases in spatial and intervariable dependences while significantly better-reducing biases in extremes compared to deep learning. The estimated drought based on the six bias-corrected and downscaled variables captured the observed drought intensity and frequency, which outperformed state-of-the-art multivariate bias correction approaches, demonstrating its capability for correcting GCMs/ESMs biases in spatial and multivariable dependences and extremes. 
  4. Abstract Atmospheric aerosol and chemistry modules are key elements in Earth system models (ESMs), as they predict air pollutant concentrations and properties that can impact human health, weather, and climate. The current uncertainty in climate projections is partly due to the inaccurate representation of aerosol direct and indirect forcing. Aerosol/chemistry parameterizations used within ESMs and other atmospheric models span large structural and parameter uncertainties that are difficult to assess independently of their host models. Moreover, there is a strong need for a standardized interface between aerosol/chemistry modules and the host model to facilitate portability of aerosol/chemistry parameterizations from one model to another, allowing not only a comparison between different parameterizations within the same modeling framework, but also quantifying the impact of different model frameworks on aerosol/chemistry predictions. To address this need, we have initiated a new community effort to coordinate the construction of a Generalized Aerosol/Chemistry Interface (GIANT) for use across weather and climate models. We aim to organize a series of community workshops and hackathons to design and build GIANT, which will serve as the interface between a range of aerosol/chemistry modules and the physics and dynamics components of atmospheric host models. GIANT will leverage ongoing efforts at the U.S. modeling centers focused on building next-generation ESMs and the international AeroCom initiative to implement this common aerosol/chemistry interface. GIANT will create transformative opportunities for scientists and students to conduct innovative research to better characterize structural and parametric uncertainties in aerosol/chemistry modules, and to develop a common set of aerosol/chemistry parameterizations. 
  5. Abstract Soil carbon (C) responses to environmental change represent a major source of uncertainty in the global C cycle. Feedbacks between soil C stocks and climate drivers could impact atmospheric CO2 levels, further altering the climate. Here, we assessed the reliability of Earth system model (ESM) predictions of soil C change using the Coupled Model Intercomparison Project phases 5 and 6 (CMIP5 and CMIP6). ESMs predicted global soil C gains under the high emission scenario, with soils taking up 43.9 Pg (95% CI: 9.2–78.5 Pg) C on average during the 21st century. The variation in global soil C change declined significantly from CMIP5 (with average of 48.4 Pg [95% CI: 2.0–94.9 Pg] C) to CMIP6 models (with average of 39.3 Pg [95% CI: 23.9–54.7 Pg] C). For some models, a small C increase in all biomes contributed to this convergence. For other models, offsetting responses between cold and warm biomes contributed to convergence. Although soil C predictions appeared to converge in CMIP6, the dominant processes driving soil C change at global or biome scales differed among models and in many cases between earlier and later versions of the same model. Random Forest models, for soil carbon dynamics, accounted for more than 63% variation of the global soil C change predicted by CMIP5 ESMs, but only 36% for CMIP6 models. Although most CMIP6 models apparently agree on increased soil C storage during the 21st century, this consensus obscures substantial model disagreement on the mechanisms underlying soil C response, calling into question the reliability of model predictions. 