Title: Regional Weather Variable Predictions by Machine Learning With Near-Surface Observational and Atmospheric Numerical Data
Accurate and timely regional weather prediction is vital for sectors dependent on weather-related decisions. Traditional prediction methods, based on atmospheric equations, often struggle with coarse temporal resolutions and inaccuracies. This article presents a novel machine learning (ML) model, called Micro–Macro (MiMa), that integrates both near-surface observational data from Kentucky Mesonet stations (collected every 5 min, known as Micro data) and hourly atmospheric numerical outputs (termed Macro data) for fine-resolution weather forecasting. The MiMa model employs an encoder–decoder transformer structure, with two encoders for processing multivariate data from both datasets and a decoder for forecasting weather variables over short time horizons. Each instance of the MiMa model, called a modelet, predicts the values of a specific weather parameter at an individual mesonet station. The approach is extended with Regional MiMa (Re-MiMa) modelets, which are designed to predict weather variables at ungauged locations by training on multivariate data from a few representative stations in a region, tagged with their elevations. Re-MiMa can provide highly accurate predictions across an entire region, even in areas without observational stations. Experimental results show that MiMa significantly outperforms current models, with Re-MiMa offering precise short-term forecasts for ungauged locations, marking a significant advancement in weather forecasting accuracy and applicability.
Award ID(s):
2019511
PAR ID:
10648794
Author(s) / Creator(s):
Publisher / Repository:
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS)
Date Published:
Journal Name:
IEEE Transactions on Geoscience and Remote Sensing
Volume:
63
ISSN:
0196-2892
Page Range / eLocation ID:
1 to 21
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The Maryland Mesonet project will construct a network of 75 surface observing stations with aims that include mitigating the statewide impact of severe convective storms and improving analyses of records. The spatial configuration of mesonet stations is expected to affect the utility newly provided observations will have via data assimilation, making it desirable to study the effects of mesonet configuration. Furthermore, the impact associated with any observing system configuration is constrained by errors inherent to the prediction systems used to generate forecasts, which may change with future advances in data assimilation methodology, physical parameterization schemes, and resource availability. To address such possibilities, we perform sets of observing system simulation experiments using a high-resolution regional modeling system to assess the expected impact of four candidate mesonet configurations. Experiments cover seven 18-h case study events featuring moist convective regimes associated with severe weather over the state of Maryland and are performed using two versions of our experimental modeling system: a “standard-uncertainty” configuration tuned to be representative of existing convective-allowing prediction systems and a “constrained-uncertainty” configuration with reduced boundary condition and model error that reflects a possible trajectory for future prediction systems. We find that the assimilation of mesonet data produces definitive improvements to analysis fields below 1000 m that are mediated by modeling system uncertainty. Conversely, mesonet impact on forecast verification is inconclusive and strongly variable across verification metrics. The impact of mesonet configuration appears limited by a saturation effect that caps local analysis improvements past a minimal density of observing stations. 
Significance Statement: The Maryland Mesonet project will construct 75 surface observing stations to improve the analysis of records for Maryland’s surface weather conditions as well as predictions for severe weather events. The spatial placement of sensors is expected to influence the utility of a mesonet, making it desirable to optimize mesonet layouts. The utility provided by a mesonet may also be impacted by errors in prediction systems used to generate analyses and forecasts, which are themselves subject to change given future advances in prediction methods and resources. This study uses observing system simulation experiments (OSSEs), which comprehensively simulate numerical weather prediction for a known “truth state,” to characterize the improvement we may expect from mesonet observations and evaluate four potential mesonet configurations.
  2. Abstract Recent years have seen a surge in interest in building deep learning-based fully data-driven models for weather prediction. Such deep learning models, if trained on observations, can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require a lot of training data, which may not be available from reanalysis (observational data) products. Moreover, an accurate, noise-free initial condition to start forecasting with a data-driven weather model is not available in realistic scenarios. Finally, deterministic data-driven forecasting models suffer from issues with long-term stability and unphysical climate drift, which make these models unsuitable for computing climate statistics. Given these challenges, previous studies have tried to pre-train deep learning-based weather forecasting models on a large amount of imperfect long-term climate model simulations and then re-train them on available observational data. In this article, we propose a convolutional variational autoencoder (VAE)-based stochastic data-driven model that is pre-trained on an imperfect climate model simulation from a two-layer quasi-geostrophic flow and re-trained, using transfer learning, on a small number of noisy observations from a perfect simulation. This re-trained model then performs stochastic forecasting with a noisy initial condition sampled from the perfect simulation. We show that our ensemble-based stochastic data-driven model outperforms a baseline deterministic encoder–decoder-based convolutional model in terms of short-term skills, while remaining stable for long-term climate simulations yielding accurate climatology.
  3. Abstract Some of the most intense convective storms on Earth initiate near the Sierras de Córdoba mountain range in Argentina. The goal of the RELAMPAGO field campaign was to observe these intense convective storms and their associated impacts. The intense observation period (IOP) occurred during November–December 2018. The two goals of the hydrometeorological component of RELAMPAGO IOP were 1) to perform hydrological streamflow and meteorological observations in previously ungauged basins and 2) to build a hydrometeorological modeling system for hindcast and forecast applications. During the IOP, our team was able to construct the stage–discharge curves in three basins, as hydrological instrumentation and personnel were successfully deployed based on RELAMPAGO weather forecasts. We found that the flood response time in these river locations is typically between 5 and 6 h from the peak of the rain event. The satellite-observed rainfall product IMERG-Final showed a better representation of rain gauge–estimated precipitation, while IMERG-Early and IMERG-Late had significant positive bias. The modeling component focuses on the 48-h simulation of an extreme hydrometeorological event that occurred on 27 November 2018. Using the Weather Research and Forecasting (WRF) atmospheric model and its hydrologic component WRF-Hydro as an uncoupled hydrologic model, we developed a system for hindcast, deterministic forecast, and a 60-member ensemble forecast initialized with regional-scale atmospheric data assimilation. Critically, our results highlight that streamflow simulations using the ensemble forecasting with data assimilation provide realistic flash flood forecast in terms of timing and magnitude of the peak. Our findings from this work are being used by the water managers in the region.
  4. One of the benefits of training a process-based land surface model is the capacity to use it in ungauged sites as a complement to standard weather stations for predicting energy fluxes, evapotranspiration, and surface and root-zone soil temperature and moisture. In this study, dynamic (i.e., time-evolving) vegetation parameters were derived from remotely sensed Moderate Resolution Imaging Spectroradiometer (MODIS) imagery and coupled with a physics-based land surface model (TIN-based Real-time Integrated Basin Simulator (tRIBS)) at four eddy covariance (EC) sites in south-central U.S. to test the predictability of micro-meteorological, soil-related, and energy flux-related variables. One cropland and one grassland EC site in northern Oklahoma, USA, were used to tune the model with respect to energy fluxes, soil temperature, and moisture. Calibrated model parameters, mostly related to the soil, were then transferred to two other EC sites in Oklahoma with similar soil and vegetation types. New dynamic vegetation parameter time series were updated according to MODIS imagery at each site. Overall, the tRIBS model captured both seasonal and diurnal cycles of the energy partitioning and soil temperatures across all four stations, as indicated by the model assessment metrics, although large uncertainties appeared in the prediction of ground heat flux, surface, and root-zone soil moisture at some stations. The transferability of previously calibrated model parameters and the use of MODIS to derive dynamic vegetation parameters enabled rapid yet reasonable predictions. The model was proven to be a convenient complement to standard weather stations, particularly for sites where eddy covariance or similar equipment is not available.
  5. Arctic amplification has altered the climate patterns both regionally and globally, resulting in more frequent and more intense extreme weather events in the past few decades. The essential part of Arctic amplification is the unprecedented sea ice loss as demonstrated by satellite observations. Accurately forecasting Arctic sea ice from sub-seasonal to seasonal scales has been a major research question with fundamental challenges at play. In addition to physics-based Earth system models, researchers have been applying multiple statistical and machine learning models for sea ice forecasting. Looking at the potential of data-driven approaches to study sea ice variations, we propose MT-IceNet – a UNet-based spatial and multi-temporal (MT) deep learning model for forecasting Arctic sea ice concentration (SIC). The model uses an encoder-decoder architecture with skip connections and processes multi-temporal input streams to regenerate spatial maps at future timesteps. Using bi-monthly and monthly satellite retrieved sea ice data from NSIDC as well as atmospheric and oceanic variables from ERA5 reanalysis product during 1979-2021, we show that our proposed model provides promising predictive performance for per-pixel SIC forecasting with up to 60% decrease in prediction error for a lead time of 6 months as compared to its state-of-the-art counterparts. 