Title: Near-Real-Time Forecast of Satellite-Based Soil Moisture Using Long Short-Term Memory with an Adaptive Data Integration Kernel
Nowcasts, or near-real-time (NRT) forecasts, of soil moisture based on the Soil Moisture Active Passive (SMAP) mission could provide substantial value for a range of applications including hazards monitoring and agricultural planning. To provide such a NRT forecast with high fidelity, we enhanced a time series deep learning architecture, long short-term memory (LSTM), with a novel data integration (DI) kernel to assimilate the most recent SMAP observations as soon as they become available. The kernel is adaptive in that it can accommodate irregular observational schedules. Testing over the CONUS, this NRT forecast product showcases predictions with unprecedented accuracy when evaluated against subsequent SMAP retrievals. It showed smaller error than NRT forecasts reported in the literature, especially at longer forecast latency. The comparative advantage was due to LSTM’s structural improvements, as well as its ability to utilize more input variables and more training data. The DI-LSTM was compared to the original LSTM model that runs without data integration, referred to as the projection model here. We found that the DI procedure removed the autocorrelated effects of forcing errors and errors due to processes not represented in the inputs, for example, irrigation and floodplain/lake inundation, as well as mismatches due to unseen forcing conditions. The effects of this purely data-driven DI kernel are discussed for the first time in the geosciences. Furthermore, this work presents an upper-bound estimate for the random component of the SMAP retrieval error.
Award ID(s):
1832294
PAR ID:
10137627
Publisher / Repository:
American Meteorological Society
Journal Name:
Journal of Hydrometeorology
Volume:
21
Issue:
3
ISSN:
1525-755X
Page Range / eLocation ID:
p. 399-413
Sponsoring Org:
National Science Foundation
More Like this
  1. The Soil Moisture Active Passive (SMAP) mission measures important soil moisture data globally. SMAP's products might not always perform better than land surface models (LSM) when evaluated against in situ measurements. However, we hypothesize that SMAP presents added value for long-term soil moisture estimation in a data fusion setting as evaluated by in situ data. Here, with the help of a time series deep learning (DL) method, we created a seamlessly extended SMAP data set to test this hypothesis and, importantly, gauge whether such benefits extend to years beyond SMAP's limited lifespan. We first show that the DL model, called long short-term memory (LSTM), can extrapolate SMAP for several years and the results are similar to the training period. We obtained prolongation results with low-performance degradation where SMAP itself matches well with in situ data. Interannual trends of root-zone soil moisture are surprisingly well captured by LSTM. In some cases, LSTM's performance is limited by SMAP, whose main issue appears to be its shallow sensing depth. Despite this limitation, a simple average between LSTM and an LSM Noah frequently outperforms Noah alone. Moreover, Noah combined with LSTM is more skillful than when it is combined with another LSM. Over sparsely instrumented sites, the Noah-LSTM combination shows a stronger edge. Our results verified the value of LSTM-extended SMAP data. Moreover, DL is completely data driven and does not require structural assumptions. As such, it has its unique potential for long-term projections and may be applied synergistically with other model-data integration techniques. 
  2. Recent observations with varied schedules and types (moving average, snapshot, or regularly spaced) can help to improve streamflow forecasts, but it is challenging to integrate them effectively. Based on a long short‐term memory (LSTM) streamflow model, we tested multiple versions of a flexible procedure we call data integration (DI) to leverage recent discharge measurements to improve forecasts. DI accepts lagged inputs either directly or through a convolutional neural network unit. DI ubiquitously elevated streamflow forecast performance to unseen levels, reaching a record continental‐scale median Nash‐Sutcliffe Efficiency coefficient value of 0.86. Integrating moving‐average discharge, discharge from the last few days, or even average discharge from the previous calendar month could all improve daily forecasts. Directly using lagged observations as inputs was comparable in performance to using the convolutional neural network unit. Importantly, we obtained valuable insights regarding hydrologic processes impacting LSTM and DI performance. Before applying DI, the base LSTM model worked well in mountainous or snow‐dominated regions, but less well in regions with low discharge volumes (due to either low precipitation or high precipitation‐energy synchronicity) and large interannual storage variability. DI was most beneficial in regions with high flow autocorrelation: it greatly reduced baseflow bias in groundwater‐dominated western basins and also improved peak prediction for basins with dynamical surface water storage, such as the Prairie Potholes or Great Lakes regions. However, even DI cannot elevate performance in high‐aridity basins with 1‐day flash peaks. Despite this limitation, there is much promise for a deep‐learning‐based forecast paradigm due to its performance, automation, efficiency, and flexibility. 
  3. Many coastal cities are facing frequent flooding from storm events that are made worse by sea level rise and climate change. The groundwater table level in these low relief coastal cities is an important, but often overlooked, factor in the recurrent flooding these locations face. Infiltration of stormwater and water intrusion due to tidal forcing can cause already shallow groundwater tables to quickly rise toward the land surface. This decreases available storage which increases runoff, stormwater system loads, and flooding. Groundwater table forecasts, which could help inform the modeling and management of coastal flooding, are generally unavailable. This study explores two machine learning models, Long Short-term Memory (LSTM) networks and Recurrent Neural Networks (RNN), to model and forecast groundwater table response to storm events in the flood prone coastal city of Norfolk, Virginia. To determine the effect of training data type on model accuracy, two types of datasets, (i) the continuous time series and (ii) a dataset of only storm events, created from observed groundwater table, rainfall, and sea level data from 2010–2018, are used to train and test the models. Additionally, a real-time groundwater table forecasting scenario was carried out to compare the models’ abilities to predict groundwater table levels given forecast rainfall and sea level as input data. When modeling the groundwater table with observed data, LSTM networks were found to have more predictive skill than RNNs (root mean squared error (RMSE) of 0.09 m versus 0.14 m, respectively). The real-time forecast scenario showed that models trained only on storm event data outperformed models trained on the continuous time series data (RMSE of 0.07 m versus 0.66 m, respectively) and that LSTM outperformed RNN models. Because models trained with the continuous time series data had much higher RMSE values, they were not suitable for predicting the groundwater table in the real-time scenario when using forecast input data. These results demonstrate the first use of LSTM networks to create hourly forecasts of groundwater table in a coastal city and show they are well suited for creating operational forecasts in real-time. As groundwater table levels increase due to sea level rise, forecasts of groundwater table will become an increasingly valuable part of coastal flood modeling and management. 
  4. Probabilistic forecasts of changes in soil moisture and an Evaporative Stress Index (ESI) on sub-seasonal time scales over the contiguous U.S. are developed. The forecasts use the current land surface conditions and numerical weather prediction forecasts from the Sub-seasonal to Seasonal (S2S) Prediction Project. Changes in soil moisture are quite predictable 8-14 days in advance with 50% or more of the variance explained over the majority of the contiguous U.S.; however, changes in ESI are significantly less predictable. A simple red noise model of predictability shows that the spatial variations in forecast skill are primarily a result of variations in the autocorrelation, or persistence, of the predicted variable, especially for the ESI. The difference in overall skill between soil moisture and ESI, on the other hand, is due to the greater soil moisture predictability by the numerical model forecasts. As the forecast lead time increases from 8-14 days to 15-28 days, however, the autocorrelation dominates the soil moisture and ESI differences as well. An analysis of modelled transpiration, and bare soil and canopy water evaporation contributions to total evaporation, suggests improvements to the ESI forecasts can be achieved by estimating the relative contributions of these components to the initial ESI state. The importance of probabilistic forecasts for reproducing the correct probability of anomaly intensification is also shown. 
  5. Research in different agricultural sectors, including crop loss estimation during floods and yield estimation, relies substantially on inundation information. Spaceborne remote sensing has been widely used in the mapping and monitoring of floods. However, the inability of optical remote sensing to penetrate cloud and the scarcity of SAR data with fine temporal resolution hinder the application of flood mapping in many cases. Soil Moisture Active Passive (SMAP) level 4 products, which are model-driven soil moisture data derived from SMAP observations and are available at 3-h intervals, can offer an intermediate but effective solution. This study maps flood progress in croplands by incorporating SMAP surface soil moisture, soil physical properties, and national floodplain information. Soil moisture above the effective soil porosity is a direct indication of soil saturation. Soil moisture also increases considerably during a flood event. Therefore, this approach took into account three conditions to map the flooded pixels: a minimum increase of 0.05 m³ m⁻³ in soil moisture from the pre-flood to the post-flood condition, soil moisture above the effective soil porosity, and persistence of the saturated condition for 72 consecutive hours. Results indicated that the SMAP-derived maps captured most of the flooded areas in the reference maps in the majority of cases, though with some degree of overestimation (due to the coarse spatial resolution of SMAP). Finally, the inundated croplands are extracted from the saturated areas using the Special Flood Hazard Areas (SFHA) of the Federal Emergency Management Agency (FEMA) and the cropland data layer (CDL). The flood maps extracted from SMAP data are validated with FEMA-declared affected counties as well as with flood maps from other sources. 
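The three flood-mapping conditions listed in item 5 above can be illustrated with a short sketch. This is not the authors' code. The function name `flooded_mask`, its parameters, and the array layout are hypothetical, and two details are assumptions: the post-flood value is taken as the peak soil moisture in the window, and 72 consecutive hours is read as 24 consecutive 3-hourly values.

```python
import numpy as np

def flooded_mask(sm, pre_flood_sm, porosity,
                 step_hours=3, min_increase=0.05, hold_hours=72):
    """Flag pixels that meet all three conditions (hypothetical helper).

    sm           : array (T, n_pixels), soil moisture every `step_hours` hours
    pre_flood_sm : array (n_pixels,), soil moisture before the event
    porosity     : array (n_pixels,), effective soil porosity
    All soil moisture values are volumetric (m^3 m^-3).
    """
    sm = np.asarray(sm, dtype=float)
    # Condition 2: soil moisture above the effective porosity (saturation).
    saturated = sm > np.asarray(porosity, dtype=float)

    # Condition 3: saturation held for `hold_hours` in a row.
    # Assumption: this means hold_hours / step_hours consecutive samples.
    needed = hold_hours // step_hours
    run = np.zeros(sm.shape[1:], dtype=int)
    longest = np.zeros(sm.shape[1:], dtype=int)
    for t in range(sm.shape[0]):
        run = np.where(saturated[t], run + 1, 0)   # reset when unsaturated
        longest = np.maximum(longest, run)

    # Condition 1: rise of at least `min_increase` over the pre-flood value.
    # Assumption: the post-flood value is the peak within the window.
    rose = (sm.max(axis=0) - np.asarray(pre_flood_sm, dtype=float)) >= min_increase

    return rose & (longest >= needed)

# Two synthetic pixels: the first stays saturated for 24 steps (72 h),
# the second for only 23 steps (69 h).
sm = np.full((30, 2), 0.20)
sm[3:27, 0] = 0.45
sm[3:26, 1] = 0.45
mask = flooded_mask(sm, pre_flood_sm=[0.20, 0.20], porosity=[0.40, 0.40])
print(mask)
```

In this sketch the persistence check subsumes the saturation check, since a pixel cannot hold saturation for 72 hours without exceeding porosity. The later steps named in the abstract (intersecting with FEMA floodplain areas and the cropland data layer) are not covered here.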