
Title: A machine learning approach to emulation and biophysical parameter estimation with the Community Land Model, version 5
Abstract. Land models are essential tools for understanding and predicting terrestrial processes and climate–carbon feedbacks in the Earth system, but uncertainties in their future projections are poorly understood. Improvements in physical process realism and the representation of human influence arguably make models more comparable to reality but also increase the degrees of freedom in model configuration, leading to increased parametric uncertainty in projections. In this work we design and implement a machine learning approach to globally calibrate a subset of the parameters of the Community Land Model, version 5 (CLM5) to observations of carbon and water fluxes. We focus on parameters controlling biophysical features such as surface energy balance, hydrology, and carbon uptake. We first use parameter sensitivity simulations and a combination of objective metrics including ranked global mean sensitivity to multiple output variables and non-overlapping spatial pattern responses between parameters to narrow the parameter space and determine a subset of important CLM5 biophysical parameters for further analysis. Using a perturbed parameter ensemble, we then train a series of artificial feed-forward neural networks to emulate CLM5 output given parameter values as input. We use annual mean globally aggregated spatial variability in carbon and water fluxes as our emulation and calibration targets. Validation and out-of-sample tests are used to assess the predictive skill of the networks, and we utilize permutation feature importance and partial dependence methods to better interpret the results. The trained networks are then used to estimate global optimal parameter values with greater computational efficiency than achieved by hand tuning efforts and increased spatial scale relative to previous studies optimizing at a single site. 
The resulting framework can help quantify the contribution of parameter uncertainty to overall uncertainty in land model projections.
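The permutation feature importance step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: `toy_emulator` is a hypothetical stand-in for a trained neural network emulator, and the three input parameters and sample size are assumptions chosen only for illustration. The idea is to shuffle one input parameter at a time across the ensemble and record how much the emulator's mean squared error grows.

```python
import random


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when one input column is shuffled.

    predict: function mapping one parameter vector (list) to a scalar
    X: list of parameter vectors (the ensemble), y: list of target outputs
    """
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other parameters in place.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_emulator(params):
    # Hypothetical stand-in for a trained emulator: strong dependence on
    # params[0], weak dependence on params[1], none on params[2].
    return 3.0 * params[0] + 0.5 * params[1] ** 2


# Assumed toy "perturbed parameter ensemble": 200 members, 3 parameters in [0, 1).
sampler = random.Random(42)
X = [[sampler.random() for _ in range(3)] for _ in range(200)]
y = [toy_emulator(row) for row in X]

scores = permutation_importance(toy_emulator, X, y)
for name, score in zip(["param_0", "param_1", "param_2"], scores):
    print(f"{name}: {score:.4f}")
```

In this toy setup the parameter the emulator ignores receives an importance of zero, while the strongly weighted parameter ranks highest. With a real emulator, `predict` would wrap the trained network and `y` would hold held-out model output.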
Journal Name: Advances in Statistical Climatology, Meteorology and Oceanography
Page Range / eLocation ID: 223 to 244
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Terrestrial biosphere models can help identify physical processes that control carbon dynamics, including land‐atmosphere CO2 fluxes, and have the potential to project the terrestrial ecosystem response to changing climate. It is important to identify the ecosystem processes most responsible for model predictive uncertainty and to design improved model representations and observational system studies to reduce that uncertainty. Here we identified the model parameters that contribute the most uncertainty to long‐term (~100 years) projections of net ecosystem exchange, net primary production, and aboveground biomass within a mechanistic terrestrial biosphere model, Ecosystem Demography version 2.1 (ED2). An uncertainty analysis identified parameters that represent the quantum efficiency of light to photosynthetic conversion, leaf respiration, and soil‐plant water transfer as the highest contributors to model uncertainty regardless of time frame (annual, decadal, and centennial) and output (e.g., net ecosystem exchange, net primary production, aboveground biomass). Contrary to expectations, the contribution of successional processes related to reproduction, competition, and mortality did not increase as the time scale increased. These findings suggest that uncertainty in the parameters governing short‐term ecosystem processes remains the most significant bottleneck to reducing predictive uncertainty. Key actions to reduce parameter uncertainty include more leaf‐level trait measurements across multiple sites for quantum efficiency and leaf respiration rate. Further, the empirical representation of soil‐plant water transfer should be replaced with a mechanistic, hydraulic representation of water flow, which can be constrained with direct measurements. This analysis focused on aboveground ecosystem processes. The impact of belowground carbon cycling, initial conditions, and meteorological forcing should be addressed in future studies.

  2. The modern configuration of the South East Asian Islands (SEAI) evolved over the last fifteen million years as a result of subduction, arc magmatism, and arc-continent collisions, which contributed to both increased land area and high topography. The additional land area has been postulated to enhance convective rainfall, facilitating both increased silicate weathering and the development of the modern-day Walker circulation. Using an Earth System Model in conjunction with a climate-silicate weathering model, we argue instead for a significant role of SEAI topography in both effects. This dataset archives the model output used in this investigation. All data are in NetCDF format and were generated either by the Community Earth System Model version 1.2 (CESM1.2; Hurrell et al. 2013) or by the climate-silicate weathering model GEOCLIM (Park et al. 2020). Model output is organized into 4 tar files:

    1) B1850C5.tar contains model output for the fully coupled CESM1.2 runs, for 2D fields and for 3D pressure vertical velocity (W) between 10S-10N. Monthly mean data for years 41-110 of the simulations. Naming convention: no SEAI topography: […]; 50% SEAI topography: […]; 100% SEAI topography: […]; 150% SEAI topography: […].

    2) E1850C5.tar contains model output for the slab ocean CESM1.2 runs, for 2D fields and for 3D pressure vertical velocity (W) between 10S-10N. Monthly mean data for years 21-50 of the simulations. Naming convention: no SEAI topography: […]; 50% SEAI topography: […]; 100% SEAI topography: […]; 150% SEAI topography: […].

    3) GEOCLIM.tar contains model output from the climate-silicate weathering model GEOCLIM. Data are provided for all 573 parameter combinations. All values are climatological annual means.
       All files contain these variables:
       - GMST: global mean surface temperature (in K)
       - atm_CO2_level: atmospheric pCO2 (in ppm)
       - degassing: globally-integrated CO2 flux (in mol/yr)
       The files ending with […] also contain these spatial fields:
       - lithology fraction: fraction of land covered by a lithology class
       - erosion: regolith erosion rate (m/yr)
       - weathering: Ca-Mg weathering rate (mol/m^2/yr)
       Files:
       - […]: GEOCLIM output using the modern SEAI simulation as input, with CO2 fixed to 286.7 ppm.
       - […]: GEOCLIM output using the no SEAI simulation as input, with CO2 fixed to 286.7 ppm.
       - […]: GEOCLIM output using the flat SEAI simulation as input, with CO2 fixed to 286.7 ppm.
       - […]: GEOCLIM output using the no SEAI simulation as input, with CO2 adjusted so that the system is in carbon flux equilibrium.
       - […]: GEOCLIM output using the flat SEAI simulation as input, with CO2 adjusted so that the system is in carbon flux equilibrium.

    4) Surface.tar contains land fraction and surface geopotential fields for the modern SEAI ([…]) and no SEAI ([…]) simulations.

    References
    Hurrell, J.W., Holland, M.M., Gent, P.R., Ghan, S., Kay, J.E., Kushner, P.J., Lamarque, J.F., Large, W.G., Lawrence, D., Lindsay, K. and Lipscomb, W.H., 2013. The community earth system model: a framework for collaborative research. Bulletin of the American Meteorological Society, 94(9), pp.1339-1360.
    Park, Y., Maffre, P., Goddéris, Y., Macdonald, F.A., Anttila, E.S. and Swanson-Hysell, N.L., 2020. Emergence of the Southeast Asian islands as a driver for Neogene cooling. Proceedings of the National Academy of Sciences, 117(41), pp.25319-25326.
  3. Abstract

    Earth system models (ESMs) rely on the calculation of canopy conductance in land surface models (LSMs) to quantify the partitioning of land surface energy, water, and CO2 fluxes. This is achieved by scaling stomatal conductance, gw, determined from physiological models developed for leaves. Traditionally, models for gw have been semi‐empirical, combining physiological functions with empirically determined calibration constants. More recently, optimization theory has been applied to model gw in LSMs under the premise that it has a stronger grounding in physiological theory and might ultimately lead to improved predictive accuracy. However, this premise has not been thoroughly tested. Using original field data from contrasting forest systems, we compare a widely used empirical type and a more recently developed optimization‐type gw model, termed BB and MED, respectively. Overall, we find no difference between the two models when used to simulate gw from photosynthesis data, or leaf gas exchange from a coupled photosynthesis‐conductance model, or gross primary productivity and evapotranspiration for a FLUXNET tower site with the CLM5 community LSM. Field measurements reveal that the key fitted parameters for BB and MED, g1B and g1M, exhibit strong species specificity in magnitude and sensitivity to CO2, and CLM5 simulations reveal that failure to include this sensitivity can result in significant overestimates of evapotranspiration for high‐CO2 scenarios. Further, we show that g1B and g1M can be determined from mean ci/ca (ratio of leaf intercellular to ambient CO2 concentration). Applying this relationship with ci/ca values derived from a leaf δ13C database, we obtain a global distribution of g1B and g1M, and these values correlate significantly with mean annual precipitation. This provides a new methodology for global parameterization of the BB and MED models in LSMs, tied directly to leaf physiology but unconstrained by spatial boundaries separating designated biomes or plant functional types.

  4. Abstract The terrestrial carbon cycle is a major source of uncertainty in climate projections. Its dominant fluxes, gross primary productivity (GPP) and respiration (in particular soil respiration, $R_S$), are typically estimated from independent satellite-driven models and upscaled in situ measurements, respectively. We combine carbon-cycle flux estimates and partitioning coefficients to show that historical estimates of global GPP and $R_S$ are irreconcilable. When we estimate GPP based on $R_S$ measurements and some assumptions about $R_S$:GPP ratios, the resulting global GPP values (bootstrap mean $149_{-23}^{+29}$ Pg C yr$^{-1}$) are significantly higher than most GPP estimates reported in the literature ($113_{-18}^{+18}$ Pg C yr$^{-1}$). Similarly, historical GPP estimates imply a soil respiration flux ($Rs_{\mathrm{GPP}}$, bootstrap mean of $68_{-8}^{+10}$ Pg C yr$^{-1}$) that is statistically inconsistent with most published $R_S$ values ($87_{-8}^{+9}$ Pg C yr$^{-1}$), although recent, higher GPP estimates are narrowing this gap. Furthermore, global $R_S$:GPP ratios are inconsistent with spatial averages of this ratio calculated from individual sites as well as with CMIP6 model results. This discrepancy has implications for our understanding of carbon turnover times and the terrestrial sensitivity to climate change. Future efforts should reconcile the discrepancies associated with calculations of GPP and $R_S$ to improve estimates of the global carbon budget.
  5. Abstract. The terrestrial carbon cycle plays a critical role in modulating the interactions of climate with the Earth system, but different models often make vastly different predictions of its behavior. Efforts to reduce model uncertainty have commonly focused on model structure, namely by introducing additional processes and increasing structural complexity. However, the extent to which increased structural complexity can directly improve predictive skill is unclear. While adding processes may improve realism, the resulting models are often encumbered by a greater number of poorly determined or over-generalized parameters. To guide efficient model development, here we map the theoretical relationship between model complexity and predictive skill. To do so, we developed 16 structurally distinct carbon cycle models spanning an axis of complexity and incorporated them into a model–data fusion system. We calibrated each model at six globally distributed eddy covariance sites with long observation time series and under 42 data scenarios that resulted in different degrees of parameter uncertainty. For each combination of site, data scenario, and model, we then predicted net ecosystem exchange (NEE) and leaf area index (LAI) for validation against independent local site data. Though the maximum model complexity we evaluated is lower than most traditional terrestrial biosphere models, the complexity range we explored provides universal insight into the inter-relationship between structural uncertainty, parametric uncertainty, and model forecast skill. Specifically, increased complexity only improves forecast skill if parameters are adequately informed (e.g., when NEE observations are used for calibration). Otherwise, increased complexity can degrade skill and an intermediate-complexity model is optimal. This finding remains consistent regardless of whether NEE or LAI is predicted. 
Our COMPLexity EXperiment (COMPLEX) highlights the importance of robust observation-based parameterization for land surface modeling and suggests that data characterizing net carbon fluxes will be key to improving decadal predictions of high-dimensional terrestrial biosphere models. 