This content will become publicly available on May 30, 2025

Title: Forecasting Ocean Waves off the U.S. East Coast Using an Ensemble Learning Approach
Abstract

This study introduces an ensemble learning model for predicting significant wave height and average wave period at stations along the U.S. Atlantic coast. The model uses the stacking method, combining three base learners (Lasso regression, support vector machine, and multilayer perceptron) to achieve more precise and robust predictions. To train and evaluate the models, a 20-year dataset of meteorological and wave data was used, enabling forecasts of significant wave height and average wave period at lead times of 1, 3, 6, and 12 hours. The data were collected at two NOAA buoy stations on the U.S. Atlantic coast. The findings demonstrate that the stacked ensemble model predicts significant wave height at these lead times with significantly higher accuracy than the individual base learners.

Moreover, the study investigates the influence of swell waves on forecasting significant wave height and average wave period. Notably, including swell waves improves the accuracy of the 12-hour forecast. The ensemble model outperforms the individual models in forecasting both significant wave height and average wave period, and it serves as a viable alternative to conventional coastal models for predicting wave parameters.
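The stacking arrangement described in the abstract can be sketched with scikit-learn. This is a minimal illustration, not the study's implementation: the abstract does not name the meta-learner, hyperparameters, or input features, so the linear meta-learner, the settings below, the helper names `make_supervised` and `build_stacked_model`, and the synthetic wind and wave series are all assumptions.

```python
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_supervised(features, target, horizon):
    """Pair the features at hour t with the target at hour t + horizon."""
    return features[:-horizon], target[horizon:]


def build_stacked_model():
    """Stack Lasso, SVR, and MLP base learners under a meta-learner."""
    base_learners = [
        ("lasso", make_pipeline(StandardScaler(), Lasso(alpha=0.01))),
        ("svr", make_pipeline(StandardScaler(), SVR(C=1.0))),
        ("mlp", make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
        )),
    ]
    # Assumption: a linear meta-learner trained on out-of-fold base predictions.
    # cv=5 uses contiguous, unshuffled folds for regressors.
    return StackingRegressor(
        estimators=base_learners, final_estimator=LinearRegression(), cv=5
    )


# Synthetic hourly series standing in for buoy records (illustrative only).
rng = np.random.default_rng(0)
hours = np.arange(600)
wind = 8 + 3 * np.sin(2 * np.pi * hours / 48) + rng.normal(0, 0.5, hours.size)
height = 0.2 * wind + rng.normal(0, 0.1, hours.size)
features = np.column_stack([wind, height])

# Build a 3-hour-ahead problem and hold out the final 20% of the record.
X, y = make_supervised(features, height, horizon=3)
split = int(0.8 * len(X))
model = build_stacked_model().fit(X[:split], y[:split])
predictions = model.predict(X[split:])
rmse = float(np.sqrt(np.mean((predictions - y[split:]) ** 2)))
print(f"3-hour-ahead RMSE on held-out synthetic data: {rmse:.3f} m")
```

Changing `horizon` to 1, 6, or 12 would give the other lead times mentioned above. Holding out the end of the record, rather than a random subset, keeps the evaluation closer to a real forecasting setting.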

 
Award ID(s):
2019758 2223844
PAR ID:
10512627
Author(s) / Creator(s):
; ;
Publisher / Repository:
AMS Journals
Date Published:
Journal Name:
Artificial Intelligence for the Earth Systems
ISSN:
2769-7525
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Producing high-quality forecasts of key climate variables, such as temperature and precipitation, on subseasonal time scales has long been a gap in operational forecasting. This study explores an application of machine learning (ML) models as postprocessing tools for subseasonal forecasting. Lagged numerical ensemble forecasts (i.e., an ensemble where the members have different initialization dates) and observational data, including relative humidity, pressure at sea level, and geopotential height, are incorporated into various ML methods to predict monthly average precipitation and 2-m temperature 2 weeks in advance for the continental United States. For regression, quantile regression, and tercile classification tasks, we consider using linear models, random forests, convolutional neural networks, and stacked models (a multimodel approach based on the prediction of the individual ML models). Unlike previous ML approaches that often use ensemble mean alone, we leverage information embedded in the ensemble forecasts to enhance prediction accuracy. Additionally, we investigate extreme event predictions that are crucial for planning and mitigation efforts. Considering ensemble members as a collection of spatial forecasts, we explore different approaches to using spatial information. Trade-offs between different approaches may be mitigated with model stacking. Our proposed models outperform standard baselines such as climatological forecasts and ensemble means. In addition, we investigate feature importance, trade-offs between using the full ensemble or only the ensemble mean, and different modes of accounting for spatial variability.

    Significance Statement

    Accurately forecasting temperature and precipitation on subseasonal time scales—2 weeks–2 months in advance—is extremely challenging. These forecasts would have immense value in agriculture, insurance, and economics. Our paper describes an application of machine learning techniques to improve forecasts of monthly average precipitation and 2-m temperature using lagged physics-based predictions and observational data 2 weeks in advance for the entire continental United States. For lagged ensembles, the proposed models outperform standard benchmarks such as historical averages and averages of physics-based predictions. Our findings suggest that utilizing the full set of physics-based predictions instead of the average enhances the accuracy of the final forecast.

     
  2. Extreme water levels (EWLs) resulting from tropical and extratropical cyclones pose significant risks to coastal communities and their interconnected ecosystems. To date, physically based models have enabled accurate characterization of EWLs despite their inherently high computational cost. However, the applicability of these models is limited to data-rich sites with diverse morphologic and hydrodynamic characteristics. The dependence on high-quality spatiotemporal data, which is often computationally expensive, hinders the applicability of these models to regions with limited or scarce data. To address this challenge, we present a computationally efficient deep learning framework, employing Long Short-Term Memory (LSTM) networks, to predict the evolution of EWLs beyond site-specific training stations. The framework, named LSTM-Station Approximated Models (LSTM-SAM), consists of a collection of bidirectional LSTM models enhanced with a custom attention layer mechanism embedded in the model architecture. Moreover, the LSTM-SAM framework incorporates a transfer learning approach that is applicable to target (tide-gage) stations along the U.S. Atlantic Coast. The LSTM-SAM framework demonstrates satisfactory performance, with “transferable” models achieving average Kling-Gupta Efficiency (KGE), Nash-Sutcliffe Efficiency (NSE), and Root-Mean-Square Error (RMSE) ranging from 0.78 to 0.92, 0.90 to 0.97, and 0.09 to 0.18 at the target stations, respectively. Following these results, the LSTM-SAM framework can accurately predict not only EWLs but also their evolution over time, i.e., onset, peak, and dissipation, which could assist in large-scale operational flood forecasting, especially in regions with limited resources to set up high-fidelity physically based models.
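For readers unfamiliar with the three skill metrics quoted above, the following sketch shows how they are commonly computed from paired observed and simulated series. It assumes the original 2009 form of the Kling-Gupta Efficiency; the abstract does not state which variant was used. The function names and sample values are illustrative only.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between two equal-length series."""
    squared = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - residual / variance


def kge(observed, simulated):
    """Kling-Gupta Efficiency (2009 form, assumed here): 1 is perfect."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in simulated) / n)
    covariance = sum(
        (o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated)
    ) / n
    r = covariance / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs              # variability ratio
    beta = mean_sim / mean_obs             # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Illustrative water levels: the simulation carries a constant 0.1 offset.
observed = [0.5, 1.0, 1.5, 2.0]
simulated = [o + 0.1 for o in observed]
print(rmse(observed, simulated), nse(observed, simulated), kge(observed, simulated))
```

In this example the timing and variability are reproduced exactly, so the KGE penalty comes entirely from the bias term.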
  3. Abstract Background

    Dynamical mathematical models defined by a system of differential equations are typically not easily accessible to non-experts. However, forecasts based on these types of models can help gain insights into the mechanisms driving the process and may outcompete simpler phenomenological growth models. Here we introduce a user-friendly toolbox, SpatialWavePredict, to characterize and forecast the spatial wave sub-epidemic model, which captures diverse wave dynamics by aggregating multiple asynchronous growth processes and has outperformed simpler phenomenological growth models in short-term forecasts of various infectious disease outbreaks, including SARS, Ebola, and the early waves of the COVID-19 pandemic in the US.

    Results

    This tutorial-based primer introduces and illustrates a user-friendly MATLAB toolbox for fitting and forecasting time-series trajectories using an ensemble spatial wave sub-epidemic model based on ordinary differential equations. Scientists, policymakers, and students can use the toolbox to conduct real-time short-term forecasts. The five-parameter epidemic wave model in the toolbox aggregates linked overlapping sub-epidemics and captures a rich spectrum of epidemic wave dynamics, including oscillatory wave behavior and plateaus. An ensemble strategy aims to improve forecasting performance by combining the resulting top-ranked models. The toolbox provides a tutorial for forecasting time-series trajectories, including the full uncertainty distribution derived through parametric bootstrapping, which is needed to construct prediction intervals and evaluate their accuracy. Functions are available to assess forecasting performance, estimation methods, error structures in the data, and forecasting horizons. The toolbox also includes functions to quantify forecasting performance using metrics that evaluate point and distributional forecasts, including the weighted interval score.

    Conclusions

    We have developed the first comprehensive toolbox to characterize and forecast time-series data using an ensemble spatial wave sub-epidemic model. As an epidemic situation or contagion occurs, the tools presented in this tutorial can help policymakers guide the implementation of containment strategies and assess the impact of control interventions. We demonstrate the functionality of the toolbox with examples, including a tutorial video, using daily data on the COVID-19 pandemic in the USA.
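The weighted interval score mentioned in the Results can be illustrated with a short sketch. This is not the toolbox's MATLAB code or its API. It follows the commonly used definition, in which the score combines the absolute error of the median with the interval scores of several central prediction intervals. The function names and numbers are hypothetical.

```python
def interval_score(lower, upper, observed, alpha):
    """Score a central (1 - alpha) prediction interval; lower is better."""
    width = upper - lower
    penalty_below = (2 / alpha) * max(lower - observed, 0)
    penalty_above = (2 / alpha) * max(observed - upper, 0)
    return width + penalty_below + penalty_above


def weighted_interval_score(median, intervals, observed):
    """Combine the median error and K interval scores into one number.

    `intervals` maps alpha to (lower, upper), e.g. 0.1 for a 90% interval.
    The median term has weight 1/2 and each interval has weight alpha / 2.
    """
    total = 0.5 * abs(observed - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2) * interval_score(lower, upper, observed, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 10 cases, with 50% and 90% intervals.
forecast_intervals = {0.5: (8, 12), 0.1: (5, 15)}
score_inside = weighted_interval_score(10, forecast_intervals, observed=10)
score_outside = weighted_interval_score(10, forecast_intervals, observed=16)
print(score_inside, score_outside)
```

An observation that falls outside the intervals is penalized in proportion to how far it misses, so the second score is larger than the first.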

     
  4. Miguel Onorato (Ed.)

    The refraction of surface gravity waves by currents leads to spatial modulations in the wave field and, in particular, in the significant wave height. We examine this phenomenon in the case of waves scattered by a localised current feature, assuming (i) the smallness of the ratio between current velocity and wave group speed, and (ii) a swell-like, highly directional wave spectrum. We apply matched asymptotics to the equation governing the conservation of wave action in the four-dimensional position–wavenumber space. The resulting explicit formulas show that the modulations in wave action and significant wave height past the localised current are controlled by the vorticity of the current integrated along the primary direction of the swell. We assess the asymptotic predictions against numerical simulations using WAVEWATCH III for a Gaussian vortex. We also consider vortex dipoles to demonstrate the possibility of ‘vortex cloaking’ whereby certain currents have (asymptotically) no impact on the significant wave height. We discuss the role of the ratio of the two small parameters characterising assumptions (i) and (ii) above, and show that caustics are significant only for unrealistically large values of this ratio, corresponding to unrealistically narrow directional spectra.

     
  5. Abstract

    Mixing processes in the upper ocean play a key role in transferring heat, momentum, and matter in the ocean. These mixing processes are significantly enhanced by wave‐driven Langmuir turbulence (LT). Based on a paired analysis of observations and simulations, this study investigates wind fetch and direction effects on LT at a coastal site south of the island of Martha’s Vineyard (MA, USA). Our results demonstrate that LT is strongly influenced by wind fetch and direction in coastal oceans, both of which contribute to controlling turbulent coastal transport processes. For northerly offshore winds, land limits the wind fetch and wave development, whereas southerly winds are associated with practically infinite fetch. Observed and simulated two‐dimensional wave height spectra reveal persistent southerly swell and substantially more developed wind‐driven waves from the south. For oblique offshore winds, waves develop more strongly in the alongshore direction with less limited fetch, resulting in significant wind and wave misalignments. Observations of coherent near‐surface crosswind velocities indicate that LT is only present for sufficiently developed waves. The fetch‐limited northerly winds inhibit wave development and the formation of LT. In addition to limited fetch, strong wind–wave misalignments prevent LT development. Although energetic and persistent, swell waves do not substantially influence LT activity during the observation period because these relatively long swell waves are associated with small Stokes drift shear. These observational results agree well with turbulence‐resolving large eddy simulations (LESs) based on the wave‐averaged Navier–Stokes equation, validating the LES approach to coastal LT in the complex wind and wave conditions.
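The remark that long swell carries little Stokes drift shear can be checked with a back-of-the-envelope sketch. It assumes a single monochromatic deep-water wave, for which the Stokes drift is a²ωk·exp(2kz) and the dispersion relation is ω² = gk. This is a textbook idealization, not the spectral or LES calculation used in the study, and the amplitudes and periods below are hypothetical.

```python
import math

GRAVITY = 9.81  # gravitational acceleration in m/s^2


def surface_stokes_shear(amplitude, period):
    """Vertical shear of the Stokes drift at the surface, in 1/s.

    Assumes one monochromatic deep-water wave (an idealization).
    """
    omega = 2 * math.pi / period                  # angular frequency
    wavenumber = omega ** 2 / GRAVITY             # deep-water dispersion
    surface_drift = amplitude ** 2 * omega * wavenumber
    # The drift decays as exp(2 k z), so its vertical derivative at z = 0
    # is 2 k times the surface drift.
    return 2 * wavenumber * surface_drift


# Hypothetical comparison at equal amplitude: 12 s swell versus 4 s wind sea.
swell_shear = surface_stokes_shear(amplitude=1.0, period=12.0)
wind_sea_shear = surface_stokes_shear(amplitude=1.0, period=4.0)
print(f"wind sea / swell shear ratio: {wind_sea_shear / swell_shear:.0f}")
```

Under these assumptions the surface shear scales with the inverse fifth power of the period, so tripling the period at fixed amplitude reduces the shear by a factor of 3⁵ = 243.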

     