

Title: Training machine learning with physics-based simulations to predict 2D soil moisture fields in a changing climate
The water content in the soil regulates exchanges between soil and atmosphere, impacts plant livelihood, and determines the antecedent condition for several natural hazards. Accurate soil moisture estimates are key to applications such as natural hazard prediction, agriculture, and water management. We explore how to best predict soil moisture at a high resolution in the context of a changing climate. Physics-based hydrological models are promising as they provide distributed soil moisture estimates and allow prediction outside the range of prior observations. This is particularly important considering that the climate is changing, and the available historical records are often too short to capture extreme events. Unfortunately, these models are extremely computationally expensive, which makes their use challenging, especially when dealing with strong uncertainties. These characteristics make them complementary to machine learning approaches, which rely on training data quality/quantity but are typically computationally efficient. We first demonstrate the ability of Convolutional Neural Networks (CNNs) to reproduce soil moisture fields simulated by the hydrological model ParFlow-CLM. Then, we show how these two approaches can be successfully combined to predict future droughts not seen in the historical timeseries. We do this by generating additional ParFlow-CLM simulations with altered forcing mimicking future drought scenarios. Comparing the performance of CNN models trained on historical forcing and CNN models trained also on simulations with altered forcing reveals the potential of combining these two approaches. The CNN can not only reproduce the moisture response to a given forcing but also learn and predict the impact of altered forcing. Given the uncertainties in projected climate change, we can create a limited number of representative ParFlow-CLM simulations (ca. 25 min/water year on 9 CPUs for our case study), train our CNNs, and use them to efficiently (seconds/water-year on 1 CPU) predict additional water years/scenarios and improve our understanding of future drought potential. This framework allows users to explore scenarios beyond past observation and tailor the training data to their application of interest (e.g., wet conditions for flooding, dry conditions for drought, etc.). With the trained ML model they can rely on high resolution soil moisture estimates and explore the impact of uncertainties.

 
Award ID(s):
2134892
NSF-PAR ID:
10468497
Publisher / Repository:
Frontiers in Water
Date Published:
Journal Name:
Frontiers in Water
Volume:
4
ISSN:
2624-9375
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    By mediating evapotranspiration processes, plant canopies play an important role in the terrestrial water cycle and regional climate. Substantial uncertainties exist in modeling canopy water interception and related hydrological processes due to rainfall forcing frequency selection and varying canopy traits. Here we design a new time interpolation method “zero” to better represent convective‐type precipitation in tropical regions. We also implement and recalibrate plant functional type‐specific interception parameters for rainforests and oil palm plantations, where oil palms express higher water interception capacity than forests, using the Community Land Model (CLM) versions 4.5 and 5.0 with CLM‐Palm embedded. Reconciling the interception scheme with realistic precipitation forcing produces more accurate canopy evaporation and transpiration for both plant functional types, which in turn improves simulated evapotranspiration and energy partitioning when benchmarked against observations from our study sites in Indonesia and an extensive literature review. Regional simulations for Sumatra and Kalimantan show that industrial oil palm plantations have 18–27% higher transpiration and 15–20% higher evapotranspiration than forests on an annual regional average basis across different ages or successional stages, even though the forests experience higher average precipitation according to reanalysis data. Our land‐only modeling results indicate that current oil palm plantations in Sumatra and Kalimantan use 15–20% more water (mean 220 mm or 20 Gt) per year compared to lowland rainforests of the same extent. The extra water use by oil palm reduces soil moisture and runoff that could affect ecosystem services such as productivity of staple crops and availability of drinking water in rural areas.

     
  2. Abstract

    Recent prolonged droughts and catastrophic wildfires in the western United States have raised concerns about the potential for forest mortality to impact forest structure, forest ecosystem services, and the economic vitality of communities in the coming decades. We used the Community Land Model (CLM) to determine forest vulnerability to mortality from drought and fire by the year 2049. We modified CLM to represent 13 major forest types in the western United States and ran simulations at a 4‐km grid resolution, driven with climate projections from two general circulation models under one emissions scenario (RCP 8.5). We developed metrics of vulnerability to short‐term extreme and prolonged drought based on annual allocation to stem growth and net primary productivity. We calculated fire vulnerability based on changes in simulated future area burned relative to historical area burned. Simulated historical drought vulnerability was medium to high in areas with observations of recent drought‐related mortality. Comparisons of observed and simulated historical area burned indicate simulated future fire vulnerability could be underestimated by 3% in the Sierra Nevada and overestimated by 3% in the Rocky Mountains. Projections show that water‐limited forests in the Rocky Mountains, Southwest, and Great Basin regions will be the most vulnerable to future drought‐related mortality, and vulnerability to future fire will be highest in the Sierra Nevada and portions of the Rocky Mountains. High carbon‐density forests in the Pacific coast and western Cascades regions are projected to be the least vulnerable to either drought or fire. Importantly, differences in climate projections lead to only 1% of the domain with conflicting low and high vulnerability to fire and no area with conflicting drought vulnerability. Our drought vulnerability metrics could be incorporated as probabilistic mortality rates in earth system models, enabling more robust estimates of the feedbacks between the land and atmosphere over the 21st century.

     
  3. Abstract

    More frequent and severe droughts are driving increased forest mortality around the globe. We urgently need to describe and predict how drought affects forest carbon cycling and identify thresholds of environmental stress that trigger ecosystem collapse. Quantifying the effects of drought at an ecosystem level is complex because dynamic climate–plant relationships can cause rapid and/or prolonged shifts in carbon balance. We employ the CARbon DAta MOdel fraMework (CARDAMOM) to investigate legacy effects of drought on forest carbon pools and fluxes. Our Bayesian model‐data fusion approach uses tower observed meteorological forcing and carbon fluxes to determine the response and sensitivity of aboveground and belowground ecological processes associated with the 2012–2015 California drought. Our study area is a mid‐montane mixed conifer forest in the Southern Sierras. CARDAMOM constrained with gross primary productivity (GPP) estimates covering 2011–2017 shows a ~75% reduction in GPP, compared to negligible GPP change when constrained with 2011 only. Precipitation across 2012–2015 was 45% (474 mm) lower than the historical average and drove a cascading depletion in soil moisture and carbon pools (foliar, labile, roots, and litter). Adding 157 mm during an especially stressful year (2014, annual rainfall = 293 mm) led to a smaller depletion of water and carbon pools, steering the ecosystem away from a state of GPP tipping‐point collapse to recovery. We present novel process‐driven insights that demonstrate the sensitivity of GPP collapse to ecosystem foliar carbon and soil moisture states, showing that the full extent of GPP response takes several years to arise. Thus, long‐term changes in soil moisture and carbon pools can provide a mechanistic link between drought and forest mortality. Our study provides an example of how key precipitation threshold ranges can influence forest productivity, making them useful for monitoring and predicting forest mortality events.

     
  4. Abstract. Plant activity in semi-arid ecosystems is largely controlled by pulses of precipitation, making them particularly vulnerable to increased aridity expected with climate change. Simple bucket-model hydrology schemes in land surface models (LSMs) have had limited ability in accurately capturing semi-arid water stores and fluxes. Recent, more complex, LSM hydrology models have not been widely evaluated against semi-arid ecosystem in situ data. We hypothesize that the failure of older LSM versions to represent evapotranspiration, ET, in arid lands is because simple bucket models do not capture realistic fluctuations in upper layer soil moisture. We therefore predict that including a discretized soil hydrology scheme based on a mechanistic description of moisture diffusion will result in an improvement in model ET when compared to data because the temporal variability of upper layer soil moisture content better corresponds to that of precipitation inputs. To test this prediction, we compared ORCHIDEE LSM simulations from (1) a simple conceptual 2-layer bucket scheme with fixed hydrological parameters; and (2) an 11-layer discretized mechanistic scheme of moisture diffusion in unsaturated soil based on Richards equations against daily and monthly soil moisture and ET observations, together with data-derived transpiration / evaporation, T / ET, ratios, from six semi-arid grass, shrub and forest sites in the southwestern USA. The 11-layer scheme also has modified calculations of surface runoff, bare soil evaporation, and water limitation to be compatible with the more complex hydrology configuration. To diagnose remaining discrepancies in the 11-layer model, we tested two further configurations: (i) the addition of a term that captures bare soil evaporation resistance to dry soil; and (ii) reduced bare soil fraction. We found that the more mechanistic 11-layer model results in better representation of the daily and monthly ET observations. We show that this is likely because of improved simulation of soil moisture in the upper layers of soil (top 5 cm). Some discrepancies between observed and modelled soil moisture and ET may allow us to prioritize future model development. Adding a soil resistance term generally decreased simulated E and increased soil moisture content, thus increasing T and T / ET ratios and reducing the negative T / ET model-data bias. By reducing the bare soil fraction in the model, we illustrated that modelled leaf T is too low at sparsely vegetated sites. We conclude that a discretized soil hydrology scheme and associated developments improve estimates of ET by allowing the model to more closely match the pulse precipitation dynamics of these semi-arid ecosystems; however, the partitioning of T from bare soil evaporation is not solved by this modification alone.
  5. While machine learning approaches are rapidly being applied to hydrologic problems, physics-informed approaches are still relatively rare. Many successful deep-learning applications have focused on point estimates of streamflow trained on stream gauge observations over time. While these approaches show promise for some applications, there is a need for distributed approaches that can produce accurate two-dimensional results of model states, such as ponded water depth. Here, we demonstrate a 2D emulator of the Tilted V catchment benchmark problem with solutions provided by the integrated hydrology model ParFlow. This emulator model can use 2D Convolutional Neural Network (CNN), 3D CNN, and U-Net machine learning architectures and produces time-dependent spatial maps of ponded water depth from which hydrographs and other hydrologic quantities of interest may be derived. A comparison of different deep learning architectures and hyperparameters is presented with particular focus on approaches such as 3D CNN (that have a time-dependent learning component) and 2D CNN and U-Net approaches (that use only the current model state to predict the next state in time). In addition to testing model performance, we also use a simplified simulation-based inference approach to evaluate the ability to calibrate the emulator to randomly selected simulations and the match between ML-calibrated input parameters and the underlying physics-based simulation.