Title: Prescreening-Based Subset Selection for Improving Predictions of Earth System Models With Application to Regional Prediction of Red Tide
We present the ensemble method of prescreening-based subset selection to improve ensemble predictions of Earth system models (ESMs). In the prescreening step, the independent ensemble members are categorized based on their ability to reproduce physically interpretable features of interest that are regional and problem-specific. The ensemble size is then updated by selecting the subsets that improve the performance of the ensemble prediction using decision-relevant metrics. We apply the method to improve the prediction of red tide along the West Florida Shelf in the Gulf of Mexico, which affects coastal water quality and has substantial environmental and socioeconomic impacts on the State of Florida. Red tide is a common name for harmful algal blooms that occur worldwide, which result from large concentrations of aquatic microorganisms, such as the dinoflagellate Karenia brevis, a toxic single-celled protist. We present an ensemble method for improving red tide prediction using the high-resolution ESMs of the Coupled Model Intercomparison Project Phase 6 (CMIP6) and reanalysis data. The study results highlight the importance of prescreening-based subset selection with decision-relevant metrics in identifying non-representative models, understanding their impact on ensemble prediction, and improving the ensemble prediction. These findings are pertinent to other regional environmental management applications and climate services. Additionally, our analysis follows the FAIR Guiding Principles for scientific data management and stewardship such that data and analysis tools are findable, accessible, interoperable, and reusable. As such, the interactive Colab notebooks developed for data analysis are annotated in the paper. This allows for efficient and transparent testing of the results' sensitivity to different modeling assumptions.
Moreover, this research serves as a starting point to build upon for red tide management, using the publicly available CMIP, Coordinated Regional Downscaling Experiment (CORDEX), and reanalysis data.
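The two steps described in the abstract (prescreening, then subset selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the member names, the prediction values, the `has_feature` flags, and the use of RMSE as the "decision-relevant metric" are all hypothetical assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical data: per-member predictions and reference observations.
predictions = {
    "A": [1.0, 2.0, 3.0],
    "B": [1.2, 1.8, 3.1],
    "C": [5.0, 5.0, 5.0],
}
observations = [1.1, 1.9, 3.0]
# Hypothetical prescreening result: does the member reproduce the
# physically interpretable feature of interest?
has_feature = {"A": True, "B": True, "C": False}


def prescreen(members, flags):
    """Categorize members by whether they reproduce the feature."""
    passed = [m for m in members if flags[m]]
    failed = [m for m in members if not flags[m]]
    return passed, failed


def ensemble_mean(preds, names):
    """Equal-weight mean prediction of the named members."""
    return [mean(values) for values in zip(*(preds[n] for n in names))]


def rmse(pred, obs):
    """Root-mean-square error (an assumed stand-in metric)."""
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5


def best_subset(preds, candidates, obs, metric=rmse):
    """Exhaustively search candidate subsets for the lowest metric value."""
    best_score, best_names = None, None
    for size in range(1, len(candidates) + 1):
        for names in combinations(candidates, size):
            score = metric(ensemble_mean(preds, names), obs)
            if best_score is None or score < best_score:
                best_score, best_names = score, names
    return best_score, best_names


passed, failed = prescreen(list(predictions), has_feature)
score, subset = best_subset(predictions, passed, observations)
print("prescreened out:", failed)
print("selected subset:", subset)
```

The exhaustive search is only practical for small ensembles, since the number of subsets grows exponentially with ensemble size.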
Award ID(s):
1939994
PAR ID:
10379676
Journal Name:
Frontiers in Earth Science
Volume:
10
ISSN:
2296-6463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The rapid expansion of Earth system model (ESM) data available from the Coupled Model Intercomparison Project Phase 6 (CMIP6) necessitates new methods to evaluate the performance and suitability of ESMs used for hydroclimate applications as these extremely large data volumes complicate stakeholder efforts to use new ESM outputs in updated climate vulnerability and impact assessments. We develop an analysis framework to inform ESM sub‐selection based on process‐oriented considerations and demonstrate its performance for a regional application in the US Pacific Northwest. First, a suite of global and regional metrics is calculated, using multiple historical observation datasets to assess ESM performance. These metrics are then used to rank CMIP6 models, and a culled ensemble of models is selected using a trend‐related diagnostics approach. This culling strategy does not dramatically change climate scenario trend projections in this region, despite retaining only 20% of the CMIP6 ESMs in the final model ensemble. The reliability of the culled trend projection envelope and model response similarity is also assessed using a perfect model framework. The absolute difference in temperature trend projections is reduced relative to the full ensemble compared to the model for each SSP scenario, while precipitation trend errors are largely unaffected. In addition, we find that the spread of the culled ensemble temperature and precipitation trends includes the trend of the “truth” model ∼83%‐92% of the time. This analysis demonstrates a reliable method to reduce ESM ensemble size that can ease use of ESMs for creating and understanding climate vulnerability and impact assessments.

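    The perfect-model check mentioned in this abstract, how often the spread of the culled ensemble includes the trend of the "truth" model, can be sketched as follows. This is an illustration only: the model names and trend values are hypothetical, and a single fixed culled list is assumed, whereas the abstract does not say how the culled ensemble is formed for each "truth" model.

```python
# Hypothetical trend values (for example, degrees C per decade) for five models.
trends = {"m1": 0.20, "m2": 0.25, "m3": 0.30, "m4": 0.35, "m5": 0.50}
# Hypothetical culled ensemble (assumed fixed for every "truth" model).
culled = ["m1", "m3", "m5"]


def coverage(all_trends, culled_names):
    """Fraction of 'truth' models whose trend lies inside the culled spread.

    Each model in turn is treated as the truth and excluded from the
    culled ensemble before the min-max spread is computed.
    """
    hits = 0
    for truth, truth_trend in all_trends.items():
        others = [all_trends[m] for m in culled_names if m != truth]
        if min(others) <= truth_trend <= max(others):
            hits += 1
    return hits / len(all_trends)


fraction = coverage(trends, culled)
print(f"truth trend inside culled spread: {fraction:.0%}")
```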
  2. This second consensus document builds on the first, providing updates on actions to address the initial recommendations and identifying additional actions that will advance management of red tide. The HAB Task Force continues to recommend actions that create improved understanding of red tide and translate it into enhanced management. Like its predecessor, this document is not intended to provide an exhaustive list of useful actions. The recommendations are meant to complement and support other efforts to set long-term goals and implement specific actions that minimize the harmful effects of red tide as well as a variety of other HABs that impact Florida, such as the work of the Blue-Green Algae Task Force. 
  3. Abstract

    The California Current System (CCS) sustains economically valuable fisheries and is particularly vulnerable to ocean acidification, due to its natural upwelling of carbon-enriched waters that generate corrosive conditions for local ecosystems. Here we use a novel suite of retrospective, initialized ensemble forecasts with an Earth system model (ESM) to predict the evolution of surface pH anomalies in the CCS. We show that the forecast system skillfully predicts observed surface pH variations a year in advance over a naive forecasting method, with the potential for skillful prediction up to five years in advance. Skillful predictions of surface pH are mainly derived from the initialization of dissolved inorganic carbon anomalies that are subsequently transported into the CCS. Our results demonstrate the potential for ESMs to provide skillful predictions of ocean acidification on large scales in the CCS. Initialized ESMs could also provide boundary conditions to improve high-resolution regional forecasting systems.
  4. Machine learning algorithms are often used to model and predict animal habitat selection—the relationships between animal occurrences and habitat characteristics. For broadly distributed species, habitat selection often varies among populations and regions; thus, it would seem preferable to fit region- or population-specific models of habitat selection for more accurate inference and prediction, rather than fitting large-scale models using pooled data. However, where the aim is to make range-wide predictions, including areas for which there are no existing data or models of habitat selection, how can regional models best be combined? We propose that ensemble approaches commonly used to combine different algorithms for a single region can be reframed, treating regional habitat selection models as the candidate models. By doing so, we can incorporate regional variation when fitting predictive models of animal habitat selection across large ranges. We test this approach using satellite telemetry data from 168 humpback whales across five geographic regions in the Southern Ocean. Using random forests, we fitted a large-scale model relating humpback whale locations, versus background locations, to 10 environmental covariates, and made a circumpolar prediction of humpback whale habitat selection. We also fitted five regional models, the predictions of which we used as input features for four ensemble approaches: an unweighted ensemble, an ensemble weighted by environmental similarity in each cell, stacked generalization, and a hybrid approach wherein the environmental covariates and regional predictions were used as input features in a new model. We tested the predictive performance of these approaches on an independent validation dataset of humpback whale sightings and whaling catches. These multiregional ensemble approaches resulted in models with higher predictive performance than the circumpolar naive model. 
These approaches can be used to incorporate regional variation in animal habitat selection when fitting range-wide predictive models using machine learning algorithms. This can yield more accurate predictions across regions or populations of animals that may show variation in habitat selection. 
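    Two of the four ensemble approaches named in this abstract, the unweighted ensemble and the ensemble weighted by environmental similarity, reduce to a plain or weighted average of the regional model predictions in each cell. The sketch below is illustrative only: the prediction values and similarity weights are hypothetical, and how the authors compute environmental similarity is not described in the abstract.

```python
# Hypothetical predictions from three regional models for one grid cell.
regional_preds = [0.2, 0.8, 0.5]
# Hypothetical similarity of the cell's environment to each model's region.
similarity = [1.0, 3.0, 1.0]


def unweighted_ensemble(preds):
    """Plain average of the regional predictions."""
    return sum(preds) / len(preds)


def weighted_ensemble(preds, weights):
    """Average of the regional predictions weighted by similarity."""
    total = sum(weights)
    return sum(p * w for p, w in zip(preds, weights)) / total


unweighted_pred = unweighted_ensemble(regional_preds)
weighted_pred = weighted_ensemble(regional_preds, similarity)
print(unweighted_pred, weighted_pred)
```

    With these numbers the weighted prediction is pulled toward the second regional model, whose region is assumed to resemble the cell most closely.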
  5. Abstract

    This paper analyzes findings from semistructured interviews and focus groups with 31 farmers in the Willamette Valley in which farmers were asked about their needs for climate data and about the usability of a range of outputs from the Community Earth System Model, version 2 (CESM2), for their soil management practices. Findings indicate that climate and soils data generated from CESM and other Earth system models (ESMs), despite their coarse spatial scale resolutions, can inform farmers’ long-term decisions, but that the data would be more usable if the outputs were provided in a format that allowed farmers to choose the variables and thresholds relevant to their particular needs and if ESMs incorporated farmer practices including residue removal, cover cropping, and tillage levels into the model operations so that farmers could better understand the impacts of their decisions. Findings also suggest that although there is a significant gap in the spatial resolution at which these global ESMs generate data and the spatial resolution needed by farmers to make most decisions, farmers are adept at making scalar adjustments to apply coarse-resolution data to the specifics of their own farm’s microclimate. Thus, our findings suggest that, to support agricultural decision-making, development priorities for ESMs should include developing better representations of agricultural management practices within the models and creating interactive data dashboards or platforms.