Title: The PEcAn+SIPNET Terrestrial Carbon Cycle Reanalysis: Development and Validation
Improving our ability to understand and predict the dynamics of the terrestrial carbon cycle remains a pressing challenge despite a rapidly growing volume and diversity of Earth Observation data. State data assimilation represents a path forward via an iterative cycle of making process-based forecasts and then statistically reconciling these forecasts against numerous ground-based and remotely sensed data constraints into a “reanalysis” data product that provides full spatiotemporal carbon budgets with robust uncertainty accounting. Here we report on a >100x expansion of the PEcAn+SIPNET reanalysis from 500 sites across CONUS, 25 ensemble members, and 2 data constraints to 6400 sites across North America, 100 ensemble members, and 5 data constraints: GEDI and LandTrendr AGB, MODIS LAI, SoilGrids Soil C, and SMAP soil moisture. We also report on an ensemble-based machine learning (ML) downscaling to a 1 km product that preserves spatial, temporal, and across-variable covariances and demonstrate the impacts of these covariances on uncertainty accounting (Fig. 1). Synergistically, we use the same ML models to assess which climate, vegetation, and soil variables explain the spatiotemporal variability in different C pools and fluxes. In addition, we review a wide range of ongoing validation activities, comparing the outputs of the reanalysis against withheld data from: AmeriFlux and NEON NEE and LE; USFS Forest Inventory biomass, biomass increment, tree rings, soil C, and litter; and NEON soil C and soil respiration. Finally, we touch on ML analyses to diagnose and correct systematic biases and emulator-based recalibration efforts.
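The point about covariances and uncertainty accounting can be illustrated with a small sketch. It is not the PEcAn+SIPNET workflow: the data are synthetic, and the function name `aggregate_uncertainty` and the error structure (one shared error plus independent per-pixel errors) are assumptions made only for illustration. It compares the spread of a regional total computed member by member, which retains between-pixel covariances, against the spread obtained by adding per-pixel variances as if the pixels were independent.

```python
import numpy as np


def aggregate_uncertainty(ensemble):
    """Return (sd_with_cov, sd_independent) for the sum over pixels.

    ensemble: array of shape (n_members, n_pixels); each row is one
    ensemble member's map of a hypothetical carbon pool.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    # Sum within each member first, so covariances among pixels
    # carry through to the spread of the regional total.
    totals = ensemble.sum(axis=1)
    sd_with_cov = totals.std(ddof=1)
    # Naive alternative: treat pixels as independent and add variances.
    sd_independent = np.sqrt(ensemble.var(axis=0, ddof=1).sum())
    return sd_with_cov, sd_independent


# Synthetic example (assumed numbers): 100 members, 50 pixels.
rng = np.random.default_rng(0)
n_members, n_pixels = 100, 50
shared = rng.normal(0.0, 1.0, size=(n_members, 1))       # error common to all pixels
local = rng.normal(0.0, 1.0, size=(n_members, n_pixels))  # pixel-specific error
ensemble = 10.0 + shared + local

sd_cov, sd_ind = aggregate_uncertainty(ensemble)
print(f"sd of total, covariances kept:    {sd_cov:.1f}")
print(f"sd of total, independence assumed: {sd_ind:.1f}")
```

Under these assumed error terms, the variance of the total is 50² + 50 = 2550 in expectation, while the independence assumption gives 50 × 2 = 100, so ignoring positive covariances would understate the uncertainty of the aggregated budget several-fold.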
Award ID(s):
2406258
PAR ID:
10604937
Author(s) / Creator(s):
Publisher / Repository:
eLTER Annual Meeting
Date Published:
Journal Name:
ARPHA Conference Abstracts
Volume:
8
ISSN:
2603-3925
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Rangelands provide significant environmental benefits through many ecosystem services, which may include soil organic carbon (SOC) sequestration. However, quantifying SOC stocks and monitoring carbon (C) fluxes in rangelands are challenging due to the considerable spatial and temporal variability tied to rangeland C dynamics as well as limited data availability. We developed the Rangeland Carbon Tracking and Management (RCTM) system to track long‐term changes in SOC and ecosystem C fluxes by leveraging remote sensing inputs and environmental variable data sets with algorithms representing terrestrial C‐cycle processes. Bayesian calibration was conducted using quality‐controlled C flux data sets obtained from 61 AmeriFlux and NEON flux tower sites from Western and Midwestern US rangelands to parameterize the model according to dominant vegetation classes (perennial and/or annual grass, grass‐shrub mixture, and grass‐tree mixture). The resulting RCTM system produced higher model accuracy for estimating annual cumulative gross primary productivity (GPP) (R² > 0.6, RMSE < 390 g C m⁻²) relative to net ecosystem exchange of CO₂ (NEE) (R² > 0.4, RMSE < 180 g C m⁻²). Model performance in estimating rangeland C fluxes varied by season and vegetation type. The RCTM captured the spatial variability of SOC stocks with R² = 0.6 when validated against SOC measurements across 13 NEON sites. Model simulations indicated slightly enhanced SOC stocks for the flux tower sites during the past decade, which is mainly driven by an increase in precipitation. Future efforts to refine the RCTM system will benefit from long‐term network‐based monitoring of vegetation biomass, C fluxes, and SOC stocks.
  2. In this study, nine different statistical models are constructed using different combinations of predictors, including models with and without projected predictors. Multiple machine learning (ML) techniques are employed to optimize the ensemble predictions by selecting the top-performing ensemble members and determining the weights for each ensemble member. The ML-Optimized Ensemble (ML-OE) forecasts are evaluated against the Simple-Averaging Ensemble (SAE) forecasts. The results show that for response variables that are predicted with significant skill by individual ensemble members and SAE, such as Atlantic tropical cyclone counts, the performance of SAE is comparable to the best ML-OE results. However, for response variables that are poorly modeled by individual ensemble members, such as Atlantic and Gulf of Mexico major hurricane counts, ML-OE predictions often show higher skill scores than individual model forecasts and the SAE predictions. Neither SAE nor ML-OE, though, was able to improve the forecasts of the response variables when all models show a consistent bias. The results also show that increasing the number of ensemble members does not necessarily lead to better ensemble forecasts. The best ensemble forecasts are from the optimally combined subset of models.
  3. Producing high-quality forecasts of key climate variables, such as temperature and precipitation, on subseasonal time scales has long been a gap in operational forecasting. This study explores an application of machine learning (ML) models as post-processing tools for subseasonal forecasting. Lagged numerical ensemble forecasts (i.e., an ensemble where the members have different initialization dates) and observational data, including relative humidity, pressure at sea level, and geopotential height, are incorporated into various ML methods to predict monthly average precipitation and two-meter temperature two weeks in advance for the continental United States. For regression, quantile regression, and tercile classification tasks, we consider using linear models, random forests, convolutional neural networks, and stacked models (a multi-model approach based on the prediction of the individual ML models). Unlike previous ML approaches that often use ensemble mean alone, we leverage information embedded in the ensemble forecasts to enhance prediction accuracy. Additionally, we investigate extreme event predictions that are crucial for planning and mitigation efforts. Considering ensemble members as a collection of spatial forecasts, we explore different approaches to using spatial information. Trade-offs between different approaches may be mitigated with model stacking. Our proposed models outperform standard baselines such as climatological forecasts and ensemble means. In addition, we investigate feature importance, trade-offs between using the full ensemble or only the ensemble mean, and different modes of accounting for spatial variability.
  4. Accurate quantification of soil carbon fluxes is essential to reduce uncertainty in estimates of the terrestrial carbon sink. However, these fluxes vary over time and across ecosystem types, so it can be difficult to estimate them accurately across large scales. The flux‐gradient method estimates soil carbon fluxes using co‐located measurements of soil CO₂ concentration, soil temperature, soil moisture and other soil properties. The National Ecological Observatory Network (NEON) provides such data across 20 ecoclimatic domains spanning the continental U.S., Puerto Rico, Alaska and Hawai‘i. We present an R software package (neonSoilFlux) that acquires soil environmental data to compute half‐hourly soil carbon fluxes for each soil replicate plot at a given terrestrial NEON site. To assess the computed fluxes, we visited six focal NEON sites and measured soil carbon fluxes using a closed‐dynamic chamber approach. Outputs from neonSoilFlux showed agreement with measured fluxes (R² between measured and neonSoilFlux outputs ranging from 0.12 to 0.77 depending on calculation method used); measured outputs generally fell within the range of calculated uncertainties from the gradient method. Calculated fluxes from neonSoilFlux aggregated to the daily scale exhibited expected site‐specific seasonal patterns. While the flux‐gradient method is broadly effective, its accuracy is highly sensitive to site‐specific inputs, including the extent to which gap‐filling techniques are used to interpolate missing sensor data and to estimates of soil diffusivity and moisture content. Future refinement and validation of neonSoilFlux outputs can contribute to existing databases of soil carbon flux measurements, providing near real‐time estimates of a critical component of the terrestrial carbon cycle.
  5. Soil nitrogen (N) is an important driver of plant productivity and ecosystem functioning; consequently, it is critical to understand its spatial variability from local to global scales. Here we provide a quantitative assessment of the three-dimensional spatial distribution of soil N across the conterminous United States (CONUS) using a digital soil mapping (DSM) approach. We used a random forest-regression kriging algorithm to predict soil N concentrations and associated uncertainty across six soil depths (0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm) on 5 km spatial grids. Across CONUS, there is a strong spatial dependence of soil N, where soil N concentrations decrease but uncertainty increases with soil depth. Soil N was higher in Pacific Northwest, Northeast, and Great Lakes National Ecological Observatory Network (NEON) ecoclimatic domains. Model uncertainty was higher in Atlantic Neotropical, Southern Rockies/Colorado Plateau and Southeast NEON domains. We also compared our soil N predictions with satellite-derived gross primary production (GPP) and forest biomass from the National Biomass and Carbon Dataset. Finally, we used uncertainty information to propose optimized locations for designing future soil surveys and found that the Atlantic Neotropical, Pacific Northwest, Pacific Southwest, and Appalachian/Cumberland Plateau NEON domains may require larger survey efforts. We highlight the need to increase knowledge of biophysical factors regulating soil processes at greater depths to better characterize the three-dimensional space of soils. Our results provide a national benchmark regarding the spatial variability and uncertainty of soil N and reveal areas in need of a better representation. This dataset includes all covariates used for modeling soil nitrogen, the training data, and the modeling output. The output represents raster files at 5 km resolution of soil N at different depths and associated model uncertainty.
     Main reference: Smith EM, Guevara M, Tarin T, Pouyat R, Vargas R. Spatial variability and uncertainty of soil nitrogen across the conterminous United States (in review). Ecosphere.