Title: Graph-Guided Regularized Regression of Pacific Ocean Climate Variables to Increase Predictive Skill of Southwestern U.S. Winter Precipitation
Abstract Understanding the physical drivers of seasonal hydroclimatic variability and improving predictive skill remain a challenge with important socioeconomic and environmental implications for many regions around the world. Physics-based deterministic models show limited ability to predict precipitation as the lead time increases, due to imperfect representation of physical processes and incomplete knowledge of initial conditions. Similarly, statistical methods drawing upon established climate teleconnections have low prediction skill due to the complex nature of the climate system. Recently, promising data-driven approaches have been proposed, but they often suffer from overparameterization and overfitting due to the short observational record, and they often do not account for spatiotemporal dependencies among covariates (i.e., predictors such as sea surface temperatures). This study addresses these challenges via a predictive model based on a graph-guided regularizer that simultaneously promotes similarity of predictive weights for highly correlated covariates and enforces sparsity in the covariate domain. This approach both decreases the effective dimensionality of the problem and identifies the most predictive features without specifying them a priori. We use large ensemble simulations from a climate model to construct this regularizer, reducing the structural uncertainty in the estimation. We apply the learned model to predict winter precipitation in the southwestern United States using sea surface temperatures over the entire Pacific basin, and demonstrate its superiority compared to other regularization approaches and statistical models informed by known teleconnections. Our results highlight the potential to optimally combine the space–time structure of predictor variables learned from climate models with new graph-based regularizers to improve seasonal prediction.
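The idea in the abstract above (similar weights for highly correlated covariates, plus sparsity) can be illustrated with a minimal sketch. This is not the authors' implementation. As a simplifying assumption, a quadratic graph-Laplacian penalty stands in for the paper's graph-guided regularizer, and the graph is built from sample correlations of the predictor matrix rather than from climate-model large ensembles. The function names, the correlation threshold, and the penalty weights are all hypothetical.

```python
import numpy as np


def correlation_graph_laplacian(X, threshold=0.8):
    """Laplacian of a graph linking covariates whose correlation >= threshold.

    Assumption: edges come from sample correlations of X; only strong
    positive correlations are kept, so edge weights are nonnegative.
    """
    corr = np.corrcoef(X, rowvar=False)
    A = np.where(corr >= threshold, corr, 0.0)
    np.fill_diagonal(A, 0.0)
    return np.diag(A.sum(axis=1)) - A


def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1 (element-wise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_graph_sparse(X, y, L, lam_graph=1.0, lam_l1=0.05, n_iter=2000):
    """Proximal gradient descent (ISTA) on

        (1 / 2n) * ||y - X w||^2 + (lam_graph / 2) * w' L w + lam_l1 * ||w||_1

    The Laplacian term pulls weights of linked covariates together;
    the L1 term sets weak weights exactly to zero.
    """
    n, p = X.shape
    H = X.T @ X / n + lam_graph * L          # Hessian of the smooth part
    step = 1.0 / np.linalg.eigvalsh(H)[-1]   # 1 / Lipschitz constant
    b = X.T @ y / n
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = H @ w - b
        w = soft_threshold(w - step * grad, step * lam_l1)
    return w
```

In a setting like the one described, `X` would hold sea surface temperature anomalies (one column per grid cell) and `y` the seasonal precipitation anomaly; both penalty weights would normally be chosen by cross-validation.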
Award ID(s):
1928724 1839441 1839336
NSF-PAR ID:
10209944
Journal Name:
Journal of Climate
Volume:
34
Issue:
2
ISSN:
0894-8755
Page Range / eLocation ID:
737 to 754
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Climate variability has distinct spatial patterns with the strongest signal of sea surface temperature (SST) variance residing in the tropical Pacific. This interannual climate phenomenon, the El Niño-Southern Oscillation (ENSO), impacts weather patterns across the globe via atmospheric teleconnections. Pronounced SST variability, albeit of smaller amplitude, also exists in the other tropical basins as well as in the extratropical regions. To improve our physical understanding of internal climate variability across the global oceans, we here make the case for a conceptual model hierarchy that captures the essence of observed SST variability from subseasonal to decadal timescales. The building blocks consist of the classic stochastic climate model formulated by Klaus Hasselmann, a deterministic low-order model for ENSO variability, and the effect of the seasonal cycle on both of these models. This model hierarchy allows us to trace the impacts of seasonal processes on the statistics of observed and simulated climate variability. One of the important outcomes of ENSO’s interaction with the seasonal cycle is the generation of a frequency cascade leading to deterministic climate variability on a wide range of timescales, including the near-annual ENSO Combination Mode. Using the aforementioned building blocks, we arrive at a succinct conceptual model that delineates ENSO’s ubiquitous climate impacts and allows us to revisit ENSO’s observed statistical relationships with other coherent spatio-temporal patterns of climate variability—so called empirical modes of variability. We demonstrate the importance of correctly accounting for different seasonal phasing in the linear growth/damping rates of different climate phenomena, as well as the seasonal phasing of ENSO teleconnections and of atmospheric noise forcings. 
We discuss how some of ENSO’s relationships with other modes of variability have previously been misinterpreted because of non-intuitive seasonal cycle effects on both power spectra and lead/lag correlations. Furthermore, ENSO’s impacts on climate variability outside the tropical Pacific are often larger than previously recognized, and accounting for them accurately has important implications. For instance, improved seasonal prediction skill can be achieved in the Indian Ocean by fully accounting for ENSO’s seasonally modulated and temporally integrated remote impacts. These results prompt us to refocus our attention on the tropical Pacific for understanding global patterns of climate variability and their predictability.

     
  2. Abstract Subseasonal-to-seasonal (S2S) precipitation prediction in boreal spring and summer months, which contains a significant number of high-signal events, is scientifically challenging and prediction skill has remained poor for years. Tibetan Plateau (TP) spring observed surface temperatures show a lag correlation with summer precipitation in several remote regions, but current global land–atmosphere coupled models are unable to represent this behavior due to significant errors in producing observed TP surface temperatures. To address these issues, the Global Energy and Water Exchanges (GEWEX) program launched the “Impact of Initialized Land Temperature and Snowpack on Subseasonal-to-Seasonal Prediction” (LS4P) initiative as a community effort to test the impact of land temperature in high-mountain regions on S2S prediction by climate models: more than 40 institutions worldwide are participating in this project. After using an innovative new land state initialization approach based on observed surface 2-m temperature over the TP in the LS4P experiment, results from a multimodel ensemble provide evidence for a causal relationship in the observed association between the Plateau spring land temperature and summer precipitation over several regions across the world through teleconnections. The influence is underscored by an out-of-phase oscillation between the TP and Rocky Mountain surface temperatures. This study reveals for the first time that high-mountain land temperature could be a substantial source of S2S precipitation predictability, and its effect is probably as large as ocean surface temperature over global “hotspot” regions identified here; the ensemble means in some “hotspots” produce more than 40% of the observed anomalies. This LS4P approach should stimulate more follow-on explorations.
  3. Abstract

    While most spatial data can be modeled with the assumption that distant points are uncorrelated, some problems require dependence at both far and short distances. We introduce a model to directly incorporate dependence in phenomena that influence a distant response. Spatial climate problems often have such modeling needs as data are influenced by local factors in addition to remote phenomena, known as teleconnections. Teleconnections arise from complex interactions between the atmosphere and ocean, of which the El Niño–Southern Oscillation teleconnection is a well‐known example. Our model extends the standard geostatistical modeling framework to account for effects of covariates observed on a spatially remote domain. We frame our model as an extension of spatially varying coefficient models. Connections to existing methods are highlighted, and further modeling needs are addressed by additionally drawing on spatial basis functions and predictive processes. Notably, our approach allows users to model teleconnected data without prespecifying teleconnection indices, which other methods often require. We adopt a hierarchical Bayesian framework to conduct inference and make predictions. The method is demonstrated by predicting precipitation in Colorado while accounting for local factors and teleconnection effects with Pacific Ocean sea surface temperatures. We show how the proposed model improves upon standard methods for estimating teleconnection effects and discuss its utility for climate applications.

     
  4. Abstract We assess to what extent seven state-of-the-art dynamical prediction systems can retrospectively predict winter sea surface temperature (SST) in the subpolar North Atlantic and the Nordic seas in the period 1970–2005. We focus on the region where warm water flows poleward (i.e., the Atlantic water pathway to the Arctic) and on interannual-to-decadal time scales. Observational studies demonstrate predictability several years in advance in this region, but we find that SST skill is low with significant skill only at a lead time of 1–2 years. To better understand why the prediction systems have predictive skill or lack thereof, we assess the skill of the systems to reproduce a spatiotemporal SST pattern based on observations. The physical mechanism underlying this pattern is a propagation of oceanic anomalies from low to high latitudes along the major currents, the North Atlantic Current and the Norwegian Atlantic Current. We find that the prediction systems have difficulties in reproducing this pattern. To identify whether the misrepresentation is due to incorrect model physics, we assess the respective uninitialized historical simulations. These simulations also tend to misrepresent the spatiotemporal SST pattern, indicating that the physical mechanism is not properly simulated. However, the representation of the pattern is slightly degraded in the predictions compared to historical runs, which could be a result of initialization shocks and forecast drift effects. Ways to enhance predictions could include improved initialization and better simulation of poleward circulation of anomalies. This might require model resolutions in which flow over complex bathymetry and the physics of mesoscale ocean eddies and their interactions with the atmosphere are resolved. 
Significance Statement In this study, we find that dynamical prediction systems and their respective climate models struggle to realistically represent ocean surface temperature variability in the eastern subpolar North Atlantic and Nordic seas on interannual-to-decadal time scales. In previous studies, ocean advection is proposed as a key mechanism in propagating temperature anomalies along the Atlantic water pathway toward the Arctic Ocean. Our analysis suggests that the predicted temperature anomalies are not properly circulated to the north; this is a result of model errors that seem to be exacerbated by the effect of initialization shocks and forecast drift. Better climate predictions in the study region will thus require improving the initialization step, as well as enhancing process representation in the climate models.
  5. Abstract

    Heatwaves are extreme near-surface temperature events that can have substantial impacts on ecosystems and society. Early warning systems help to reduce these impacts by helping communities prepare for hazardous climate-related events. However, state-of-the-art prediction systems often cannot forecast heatwaves accurately more than two weeks in advance, the lead time that advance warnings require. We therefore investigate the potential of statistical and machine learning methods to understand and predict central European summer heatwaves on time scales of several weeks. As a first step, we identify the most important regional atmospheric and surface predictors based on previous studies and supported by a correlation analysis: 2-m air temperature, 500-hPa geopotential, precipitation, and soil moisture in central Europe, as well as Mediterranean and North Atlantic sea surface temperatures, and the North Atlantic jet stream. Based on these predictors, we apply machine learning methods to forecast two targets: summer temperature anomalies and the probability of heatwaves for 1–6 weeks lead time at weekly resolution. For each of these two target variables, we use both a linear and a random forest model. The performance of these statistical models decays with lead time, as expected, but outperforms persistence and climatology at all lead times. For lead times longer than two weeks, our machine learning models compete with the ensemble mean of the European Centre for Medium-Range Weather Forecasts’ hindcast system. We thus show that machine learning can help improve subseasonal forecasts of summer temperature anomalies and heatwaves.

    Significance Statement

    Heatwaves (prolonged extremely warm temperatures) cause thousands of fatalities worldwide each year. These damaging events are becoming even more severe with climate change. This study aims to improve advance predictions of summer heatwaves in central Europe by using statistical and machine learning methods. Machine learning models are shown to compete with conventional physics-based models for forecasting heatwaves more than two weeks in advance. These early warnings can be used to activate effective and timely response plans targeting vulnerable communities and regions, thereby reducing the damage caused by heatwaves.
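The comparison against persistence and climatology mentioned in the abstract above can be made concrete with a small sketch. The abstract does not say which verification metric the study uses; as an assumption, the example uses the common mean-squared-error skill score, and all function names are hypothetical.

```python
import numpy as np


def mse_skill_score(forecast, observed, reference):
    """1 - MSE(forecast) / MSE(reference).

    1 means a perfect forecast, 0 means no better than the reference,
    and negative values mean worse than the reference.
    """
    forecast, observed, reference = map(np.asarray, (forecast, observed, reference))
    mse_forecast = np.mean((forecast - observed) ** 2)
    mse_reference = np.mean((reference - observed) ** 2)
    return 1.0 - mse_forecast / mse_reference


def persistence_forecast(series, lead):
    """Persistence baseline: the forecast for time t is the value at t - lead.

    The returned array aligns with the targets series[lead:].
    """
    return np.asarray(series)[:-lead]


def climatology_forecast(train, n):
    """Climatology baseline: always predict the mean of the training period."""
    return np.full(n, np.mean(train))
```

For weekly anomalies `series` and a lead of `k` weeks, a model's forecasts for `series[k:]` would be scored once with `persistence_forecast(series, k)` as the reference and once with `climatology_forecast(train, len(series) - k)`.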

     