Title: Week 3–4 Prediction of Wintertime CONUS Temperature Using Machine Learning Techniques
This paper shows that skillful week 3–4 predictions of a large-scale pattern of 2 m temperature over the US can be made based on the Nino3.4 index alone, where skillful is defined to be better than climatology. To find more skillful regression models, this paper explores various machine learning strategies (e.g., ridge regression and lasso), including those trained on observations and on climate model output. It is found that regression models trained on climate model output yield more skillful predictions than regression models trained on observations, presumably because of the larger training sample. Nevertheless, the skill of the best machine learning models is only modestly better than that of ordinary least squares based on the Nino3.4 index. Importantly, this fact is difficult to infer from the parameters of the machine learning model because very different parameter sets can produce virtually identical predictions. For this reason, attempts to interpret the source of predictability from the machine learning model can be very misleading. The skill of the machine learning models is also compared to that of a fully coupled dynamical model, CFSv2. The results depend on the skill measure: for mean square error, the dynamical model is slightly worse than the machine learning models; for correlation skill, the dynamical model is only modestly better than machine learning models or the Nino3.4 index. In summary, the best predictions of the large-scale pattern come from machine learning models trained on long climate simulations, but the skill is only modestly better than predictions based on the Nino3.4 index alone.
Award ID(s):
1822221
PAR ID:
10290456
Journal Name:
Journal Name:
Frontiers in Climate
Volume:
3
ISSN:
2624-9553
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
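The two skill measures named in the abstract above (mean square error relative to climatology, and correlation) and the comparison between a single-index least-squares fit and a regularized multi-predictor fit can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic random numbers rather than observations or climate model output, the predictor called `index` merely stands in for a Nino3.4-like index, and the function names, the 0.5 signal strength, and the ridge penalty `lam=50.0` are hypothetical choices, not the paper's setup.

```python
import numpy as np


def mse_skill_score(obs, pred, clim):
    """1 - MSE(pred) / MSE(climatology); positive means better than climatology."""
    mse_pred = np.mean((obs - pred) ** 2)
    mse_clim = np.mean((obs - clim) ** 2)
    return 1.0 - mse_pred / mse_clim


def correlation_skill(obs, pred):
    """Pearson correlation between observed and predicted values."""
    return np.corrcoef(obs, pred)[0, 1]


def fit_ols_1d(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept


def fit_ridge(X, y, lam):
    """Closed-form ridge regression on centered data; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta


# Synthetic data (assumption): the target depends weakly on one index plus noise,
# and the extra predictors carry no information about the target.
rng = np.random.default_rng(0)
n_train, n_test, n_extra = 200, 200, 20
n = n_train + n_test
index = rng.standard_normal(n)
extra = rng.standard_normal((n, n_extra))
target = 0.5 * index + rng.standard_normal(n)
X = np.column_stack([index, extra])
train, test = slice(0, n_train), slice(n_train, n)

# Climatology baseline: the training-period mean of the target.
clim = target[train].mean()

# Model 1: least squares on the single index.
slope, intercept = fit_ols_1d(index[train], target[train])
pred_ols = intercept + slope * index[test]

# Model 2: ridge regression on the index plus the uninformative predictors.
beta, b0 = fit_ridge(X[train], target[train], lam=50.0)
pred_ridge = b0 + X[test] @ beta

for name, pred in [("OLS on index", pred_ols), ("ridge, all predictors", pred_ridge)]:
    print(f"{name}: MSE skill = {mse_skill_score(target[test], pred, clim):.3f}, "
          f"correlation = {correlation_skill(target[test], pred):.3f}")
```

Both measures are evaluated on held-out samples, and the climatology is taken from the training period only, so neither model sees the verification data. Lasso is omitted because it has no closed-form solution; a library implementation would be the natural substitute.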
More Like this
  1. Abstract We investigate the predictability of the sign of daily southeastern U.S. (SEUS) precipitation anomalies associated with simultaneous predictors of large-scale climate variability using machine learning models. Models using index-based climate predictors and gridded fields of large-scale circulation as predictors are utilized. Logistic regression (LR) and fully connected neural networks using indices of climate phenomena as predictors produce neither accurate nor reliable predictions, indicating that the indices themselves are not good predictors. Using gridded fields as predictors, an LR and convolutional neural network (CNN) are more accurate than the index-based models. However, only the CNN can produce reliable predictions that can be used to identify forecasts of opportunity. Using explainable machine learning we identify which variables and grid points of the input fields are most relevant for confident and correct predictions in the CNN. Our results show that the local circulation is most important as represented by maximum relevance of 850-hPa geopotential heights and zonal winds to making skillful, high-probability predictions. Corresponding composite anomalies identify connections with El Niño–Southern Oscillation during winter and the Atlantic multidecadal oscillation and North Atlantic subtropical high during summer. 
  2. Abstract Few studies have utilized machine learning techniques to predict or understand the Madden‐Julian oscillation (MJO), a key source of subseasonal variability and predictability. Here, we present a simple framework for real‐time MJO prediction using shallow artificial neural networks (ANNs). We construct two ANN architectures, one deterministic and one probabilistic, that predict a real‐time MJO index using maps of tropical variables. These ANNs make skillful MJO predictions out to ∼18 days in October‐March and ∼11 days in April‐September, outperforming conventional linear models and efficiently capturing aspects of MJO predictability found in more complex, dynamical models. The flexibility and explainability of simple ANN frameworks are highlighted through varying model input and applying ANN explainability techniques that reveal sources and regions important for ANN prediction skill. The accessibility, performance, and efficiency of this simple machine learning framework are more broadly applicable to predicting and understanding other Earth system phenomena.
  3. Abstract Despite recent progress in seasonal forecast systems, the predictive skill for the Indian Ocean Dipole (IOD) remains typically limited to a lead time of one season or less in both dynamical and empirical models. Here we develop a simple stochastic‐dynamical model (SDM) to predict the IOD using seasonally modulated El Niño–Southern Oscillation (ENSO) forcing together with a seasonally modulated Indian Ocean coupled ocean‐atmosphere feedback. The SDM, with either observed or forecasted ENSO forcing, exhibits generally higher skill and longer lead times for predicting IOD events than the operational Climate Forecast System version 2 and the Scale Interaction Experiment–Frontier system. The improvements mainly originate from better prediction of ENSO‐dependent IOD events and from reducing false alarms. These results affirm our hypothesis that operational IOD predictability beyond persistence is largely controlled by ENSO predictability and the signal‐to‐noise ratio of the system. Therefore, potential future ENSO improvements in models should translate to more skillful IOD predictions. 
  4. Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of unknown chaotic dynamical systems. Short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics ("climate") can be produced by employing a feedback loop, whereby the model is trained to predict forward one time step, then the model output is used as input for multiple time steps. In the absence of mitigating techniques, however, this technique can result in artificially rapid error growth. In this article, we systematically examine the technique of adding noise to the ML model input during training to promote stability and improve prediction accuracy. Furthermore, we introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training. Our case study uses reservoir computing, a machine-learning method using recurrent neural networks, to predict the spatiotemporal chaotic Kuramoto-Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to the true system, while reservoir computers trained without regularization are unstable. Compared with other regularization techniques that yield stability in some cases, we find that both short-term and climate predictions from reservoir computers trained with noise or with LMNT are substantially more accurate. Finally, we show that the deterministic aspect of our LMNT regularization facilitates fast hyperparameter tuning when compared to training with noise.
  5. Abstract We assess to what extent seven state-of-the-art dynamical prediction systems can retrospectively predict winter sea surface temperature (SST) in the subpolar North Atlantic and the Nordic seas in the period 1970–2005. We focus on the region where warm water flows poleward (i.e., the Atlantic water pathway to the Arctic) and on interannual-to-decadal time scales. Observational studies demonstrate predictability several years in advance in this region, but we find that SST skill is low with significant skill only at a lead time of 1–2 years. To better understand why the prediction systems have predictive skill or lack thereof, we assess the skill of the systems to reproduce a spatiotemporal SST pattern based on observations. The physical mechanism underlying this pattern is a propagation of oceanic anomalies from low to high latitudes along the major currents, the North Atlantic Current and the Norwegian Atlantic Current. We find that the prediction systems have difficulties in reproducing this pattern. To identify whether the misrepresentation is due to incorrect model physics, we assess the respective uninitialized historical simulations. These simulations also tend to misrepresent the spatiotemporal SST pattern, indicating that the physical mechanism is not properly simulated. However, the representation of the pattern is slightly degraded in the predictions compared to historical runs, which could be a result of initialization shocks and forecast drift effects. Ways to enhance predictions could include improved initialization and better simulation of poleward circulation of anomalies. This might require model resolutions in which flow over complex bathymetry and the physics of mesoscale ocean eddies and their interactions with the atmosphere are resolved. 
Significance Statement In this study, we find that dynamical prediction systems and their respective climate models struggle to realistically represent ocean surface temperature variability in the eastern subpolar North Atlantic and Nordic seas on interannual-to-decadal time scales. In previous studies, ocean advection is proposed as a key mechanism in propagating temperature anomalies along the Atlantic water pathway toward the Arctic Ocean. Our analysis suggests that the predicted temperature anomalies are not properly circulated to the north; this is a result of model errors that seems to be exacerbated by the effect of initialization shocks and forecast drift. Better climate predictions in the study region will thus require improving the initialization step, as well as enhancing process representation in the climate models. 