Abstract Estimates of soil organic carbon (SOC) stocks are essential for many environmental applications. However, current SOC maps give significantly inconsistent SOC stock estimates for the U.S. We propose a framework that combines unsupervised multivariate geographic clustering (MGC) with supervised Random Forests regression, improving SOC maps by capturing heterogeneous relationships with SOC drivers. We first used MGC to divide the U.S. into 20 SOC regions based on the similarity of covariates (soil biogeochemical, bioclimatic, biological, and physiographic variables). We then trained a separate Random Forests model for each SOC region, using environmental covariates and SOC observations. Our estimated SOC stocks for the U.S. (52.6 ± 3.2 Pg for 0–30 cm and 108.3 ± 8.2 Pg for 0–100 cm depth) were within the range estimated by existing products such as the Harmonized World Soil Database (HWSD; 46.7 Pg for 0–30 cm and 90.7 Pg for 0–100 cm depth) and SoilGrids 2.0 (45.7 Pg for 0–30 cm and 133.0 Pg for 0–100 cm depth). However, independent validation with soil profile data from the National Ecological Observatory Network showed that our approach (R² = 0.51) outperformed the estimates obtained from HWSD (R² = 0.23) and SoilGrids 2.0 (R² = 0.39) for the topsoil (0–30 cm). Uncertainty analysis (e.g., low representativeness and high coefficients of variation) identified regions requiring more measurements, such as Alaska and the deserts of the U.S. Southwest. Our approach effectively captures the heterogeneous relationships between widely available predictors and the current SOC baseline across regions, offering reliable SOC estimates at 1 km resolution for benchmarking Earth system models.
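The two-stage framework described in the abstract (unsupervised clustering into regions, then one regression model per region) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it relies on scikit-learn, substitutes k-means on principal components as a stand-in for MGC, uses 4 regions instead of 20, and runs on synthetic data. The names `assign_region`, `predict_soc`, and the covariate matrix `X` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 600 sites, 6 covariates, and a response
# whose relationship to the covariates differs between two regimes.
X = rng.normal(size=(600, 6))
y = np.where(X[:, 0] > 0, 3.0 * X[:, 1], -2.0 * X[:, 2])
y = y + rng.normal(scale=0.1, size=600)

# Stage 1 (unsupervised): standardize, reduce with PCA, cluster into regions.
# k-means is a stand-in for MGC here; 4 regions instead of 20.
scaler = StandardScaler().fit(X)
pca = PCA(n_components=3, random_state=0).fit(scaler.transform(X))
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(
    pca.transform(scaler.transform(X))
)


def assign_region(covariates):
    """Map each row of covariates to its cluster (region) label."""
    return kmeans.predict(pca.transform(scaler.transform(covariates)))


# Stage 2 (supervised): fit one Random Forest per region.
regions = assign_region(X)
models = {}
for region in np.unique(regions):
    in_region = regions == region
    forest = RandomForestRegressor(n_estimators=50, random_state=0)
    models[int(region)] = forest.fit(X[in_region], y[in_region])


def predict_soc(covariates):
    """Predict with the model belonging to each row's region."""
    region_of = assign_region(covariates)
    out = np.full(len(covariates), np.nan)
    for region, model in models.items():
        in_region = region_of == region
        if in_region.any():
            out[in_region] = model.predict(covariates[in_region])
    return out


predictions = predict_soc(X)
```

Fitting a separate model per region lets each forest learn a different covariate-response relationship, which is the motivation the abstract gives for clustering first.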
Upscaling soil organic carbon measurements at the continental scale using multivariate clustering analysis and machine learning
Data Description:

To improve SOC estimation in the United States, we upscaled site-based SOC measurements to the continental scale using a multivariate geographic clustering (MGC) approach coupled with machine learning models. First, we used the MGC approach to segment the United States into 20 SOC regions at 30 arc second resolution, based on principal component information from environmental covariates (gNATSGO soil properties, WorldClim bioclimatic variables, MODIS biological variables, and physiographic variables). We then trained separate random forest model ensembles for each of the SOC regions, using environmental covariates and soil profile measurements from the International Soil Carbon Network (ISCN) and an Alaska soil profile dataset. Our estimated United States SOC stocks for the 0-30 cm and 0-100 cm depths were 52.6 ± 3.2 and 108.3 ± 8.2 Pg C, respectively.

Files in collection (32):

The collection contains 22 soil property geospatial rasters, 4 SOC geospatial rasters, 2 ISCN site SOC observation csv files, and 4 R scripts.

gNATSGO TIF files:

├── available_water_storage_30arc_30cm_us.tif [30 cm depth soil available water storage]
├── available_water_storage_30arc_100cm_us.tif [100 cm depth soil available water storage]
├── caco3_30arc_30cm_us.tif [30 cm depth soil CaCO3 content]
├── caco3_30arc_100cm_us.tif [100 cm depth soil CaCO3 content]
├── cec_30arc_30cm_us.tif [30 cm depth soil cation exchange capacity]
├── cec_30arc_100cm_us.tif [100 cm depth soil cation exchange capacity]
├── clay_30arc_30cm_us.tif [30 cm depth soil clay content]
├── clay_30arc_100cm_us.tif [100 cm depth soil clay content]
├── depthWT_30arc_us.tif [depth to water table]
├── kfactor_30arc_30cm_us.tif [30 cm depth soil erosion factor]
├── kfactor_30arc_100cm_us.tif [100 cm depth soil erosion factor]
├── ph_30arc_100cm_us.tif [100 cm depth soil pH]
├── ph_30arc_30cm_us.tif [30 cm depth soil pH]
├── pondingFre_30arc_us.tif [ponding frequency]
├── sand_30arc_30cm_us.tif [30 cm depth soil sand content]
├── sand_30arc_100cm_us.tif [100 cm depth soil sand content]
├── silt_30arc_30cm_us.tif [30 cm depth soil silt content]
├── silt_30arc_100cm_us.tif [100 cm depth soil silt content]
├── water_content_30arc_30cm_us.tif [30 cm depth soil water content]
└── water_content_30arc_100cm_us.tif [100 cm depth soil water content]

SOC TIF files:

├── 30cm SOC mean.tif [30 cm depth soil SOC]
├── 100cm SOC mean.tif [100 cm depth soil SOC]
├── 30cm SOC CV.tif [30 cm depth soil SOC coefficient of variation]
└── 100cm SOC CV.tif [100 cm depth soil SOC coefficient of variation]

Site observations csv files:

├── ISCN_rmNRCS_addNCSS_30cm.csv [30 cm ISCN site SOC, with NRCS sites replaced by NCSS centroid-removed data]
└── ISCN_rmNRCS_addNCSS_100cm.csv [100 cm ISCN site SOC, with NRCS sites replaced by NCSS centroid-removed data]

Data format:

Geospatial files are provided in GeoTIFF format in the Lat/Lon WGS84 (EPSG:4326) projection at 30 arc second resolution.

Geospatial projection:

GEOGCS["GCS_WGS_1984",
 DATUM["D_WGS_1984",
 SPHEROID["WGS_1984",6378137,298.257223563]],
 PRIMEM["Greenwich",0],
 UNIT["Degree",0.017453292519943295]]
(base) [jbk@theseus ltar_regionalization]$ g.proj -w
GEOGCS["wgs84",
 DATUM["WGS_1984",
 SPHEROID["WGS_1984",6378137,298.257223563]],
 PRIMEM["Greenwich",0],
 UNIT["degree",0.0174532925199433]]
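Because the rasters sit on a 30 arc second latitude/longitude grid, cell area shrinks toward the poles, so a national total in Pg C cannot be obtained by multiplying a raster sum by a single cell area. The sketch below shows one way to weight each row by its area. It assumes, which the record does not state, that a raster holds SOC density in kg C per m², and it approximates the Earth as a sphere rather than the WGS84 ellipsoid. The function names are hypothetical, and reading the GeoTIFF itself is left out.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def cell_area_m2(lat_top_deg, cell_deg):
    """Area of a square lat/lon cell whose northern edge is at lat_top_deg."""
    lat_top = math.radians(lat_top_deg)
    lat_bottom = math.radians(lat_top_deg - cell_deg)
    dlon = math.radians(cell_deg)
    return EARTH_RADIUS_M ** 2 * dlon * (math.sin(lat_top) - math.sin(lat_bottom))


def total_stock_pg(density_rows, lat_top_deg, cell_deg):
    """Sum density (assumed kg C per m^2) times cell area; None marks no-data.

    density_rows is ordered north to south, one list per raster row.
    """
    total_kg = 0.0
    for i, row in enumerate(density_rows):
        area = cell_area_m2(lat_top_deg - i * cell_deg, cell_deg)
        total_kg += sum(v for v in row if v is not None) * area
    return total_kg / 1e12  # 1 Pg = 1e12 kg


# Tiny made-up grid at 30 arc second (1/120 degree) spacing.
CELL = 1.0 / 120.0
demo_grid = [[1.0, None], [2.0, 2.0]]
demo_total = total_stock_pg(demo_grid, lat_top_deg=40.0, cell_deg=CELL)
```

At this resolution a cell is a little under 1 km² at the equator and smaller at higher latitudes, which is why the per-row area matters for a domain that spans from the southern U.S. to Alaska.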
- PAR ID: 10430670
- Publisher / Repository: Zenodo
- Date Published:
- Subject(s) / Keyword(s): the United States SOC; US soil properties; Gridded National Soil Survey Geographic Database; gNATSGO; International Soil Carbon Network (ISCN)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Emerging evidence indicates that the responses of soil organic carbon (SOC) to nitrogen (N) addition differ along the soil profile, highlighting the importance of synthesizing results from different soil layers. Here, using a global meta‐analysis, we found that N addition significantly enhanced topsoil (0–30 cm) SOC by 3.7% (±1.4%) in forests and grasslands. In contrast, SOC in the subsoil (30–100 cm) initially increased with N addition but decreased over time. The model selection analysis revealed that experimental duration and vegetation type are among the most important predictors across a wide range of climatic, environmental, and edaphic variables. The contrasting responses of SOC to N addition indicate the importance of considering deep soil layers, particularly for long‐term continuous N deposition. Finally, the lack of depth‐dependent SOC responses to N addition in experimental and modeling frameworks has likely resulted in the overestimation of changes in SOC storage under enhanced N deposition.
-
Soil nitrogen (N) is an important driver of plant productivity and ecosystem functioning; consequently, it is critical to understand its spatial variability from local-to-global scales. Here we provide a quantitative assessment of the three-dimensional spatial distribution of soil N across the conterminous United States (CONUS) using a digital soil mapping (DSM) approach. We used a random forest-regression kriging algorithm to predict soil N concentrations and associated uncertainty across six soil depths (0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm) at 5 km spatial grids. Across CONUS, there is a strong spatial dependence of soil N, where soil N concentrations decrease but uncertainty increases with soil depth. Soil N was higher in the Pacific Northwest, Northeast, and Great Lakes National Ecological Observatory Network (NEON) ecoclimatic domains. Model uncertainty was higher in the Atlantic Neotropical, Southern Rockies/Colorado Plateau, and Southeast NEON domains. We also compared our soil N predictions with satellite-derived gross primary production (GPP) and forest biomass from the National Biomass and Carbon Dataset. Finally, we used uncertainty information to propose optimized locations for designing future soil surveys and found that the Atlantic Neotropical, Pacific Northwest, Pacific Southwest, and Appalachian/Cumberland Plateau NEON domains may require larger survey efforts. We highlight the need to increase knowledge of biophysical factors regulating soil processes at deeper depths to better characterize the three-dimensional space of soils. Our results provide a national benchmark regarding the spatial variability and uncertainty of soil N and reveal areas in need of a better representation.

This dataset includes all covariates used for modeling soil nitrogen, the training data, and the modeling output. The output consists of raster files at 5 km resolution of soil N at different depths and the associated model uncertainty.

Main reference:
Smith EM, Guevara M, Tarin T, Pouyat R, Vargas R. Spatial variability and uncertainty of soil nitrogen across the conterminous United States (in review). Ecosphere.
-
Objectives: Fine roots significantly influence ecosystem-scale cycling of nutrients, carbon (C), and water, yet there is limited understanding of how fine root traits vary across and within tropical forests, some of Earth's most C-rich ecosystems. The biomass of fine roots can impact soil carbon storage, as root mortality is a primary source of new carbon to soils. A positive relationship has been observed between fine root biomass and soil carbon stocks in Panama (Cusack et al. 2018). Beyond biomass, root characteristics like specific root length (SRL) could also influence soil carbon, as roots with higher SRL are less dense and thinner, potentially decomposing more easily or promoting soil aggregation. Understanding the effects of root morphology and tissue quality on soil carbon storage, and their relationships with soil properties in general, can improve predictions of landscape-scale carbon patterns. We aggregated new data on root biomass, morphology, and nutrient content at 0-10 cm, 10-20 cm, 20-50 cm, and 50-100 cm depth increments across four distinct lowland Panamanian forests and paired them with previously published datasets (Cusack et al. 2018; Cusack and Turner 2020) of soil chemistry from the same sites and soil depths to explore the relationship between soil carbon stocks and root characteristics.

Datasets included: The datasets provided include .csv and .xlsx files for fine root characteristics and soil chemistry from four different forests across 0-10 cm, 10-20 cm, 20-50 cm, and 50-100 cm depth increments. Root characteristics include live fine root biomass, dead fine root biomass, coarse root biomass, specific root length, root diameter, root tissue density, specific root area, root %N, root %C, and root C/N ratio. Soil chemistry data include total carbon (TC), dissolved organic carbon (DOC), bulk density, total phosphorus (TP), available phosphorus (AEM Pi), and various Mehlich-extractable elements such as aluminum, calcium, iron, potassium, manganese, phosphorus, and zinc. Nitrogen content measures include ammonium, nitrate, total dissolved nitrogen (TDN), dissolved inorganic nitrogen (DIN), and dissolved organic nitrogen (DON). The dataset also includes total exchangeable bases (TEB) and effective cation exchange capacity (ECEC) in both centimoles of charge per kilogram and micromoles of charge per gram. The soil chemistry data were obtained from Cusack et al. (2018) and Cusack and Turner (2020) and paired with root characteristics data for the same depth increments and sites. Additionally, a .kml file is provided with coordinates for all 32 plots included in the study across four forests (n = 8 plots per site). Root data were averaged across these 8 plots per site, and soil data were collected in one pit at each site. This dataset serves as baseline data collected before a throughfall exclusion experiment, Panama Rainforest Changes with Experimental Drying (PARCHED), was implemented. No special software is needed to open these files.
-
This study explores machine learning for estimating soil moisture at multiple depths (0–5 cm, 0–10 cm, 0–20 cm, 0–50 cm, and 0–100 cm) across the coterminous United States. A framework is developed that integrates soil moisture from Soil Moisture Active Passive (SMAP), precipitation from the Global Precipitation Measurement (GPM), evapotranspiration from the Ecosystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS), vegetation data from the Moderate Resolution Imaging Spectroradiometer (MODIS), soil properties from gridded National Soil Survey Geographic (gNATSGO), and land cover information from the National Land Cover Database (NLCD). Five machine learning algorithms are evaluated: the feed-forward artificial neural network, random forest, extreme gradient boosting (XGBoost), Categorical Boosting, and Light Gradient Boosting Machine. The methods are tested by comparing to in situ soil moisture observations from several national and regional networks. XGBoost exhibits the best performance for estimating soil moisture, achieving higher correlation coefficients (ranging from 0.76 at 0–5 cm depth to 0.86 at 0–100 cm depth), lower root mean squared errors (from 0.024 cm³/cm³ at 0–100 cm depth to 0.039 cm³/cm³ at 0–5 cm depth), higher Nash–Sutcliffe Efficiencies (from 0.551 at 0–5 cm depth to 0.694 at 0–100 cm depth), and higher Kling–Gupta Efficiencies (from 0.511 at 0–5 cm depth to 0.696 at 0–100 cm depth). Additionally, XGBoost outperforms the SMAP Level 4 product in representing the time series of soil moisture for the networks. Key factors influencing the soil moisture estimation are elevation, clay content, aridity index, and antecedent soil moisture derived from SMAP.
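For readers unfamiliar with the skill scores quoted above, the following sketch computes the correlation coefficient, root mean squared error, Nash–Sutcliffe Efficiency, and Kling–Gupta Efficiency from paired observed and simulated values. It is an illustration only: the abstract does not say which KGE variant was used, so the commonly cited form based on correlation, variability ratio, and bias ratio is assumed here, and the function name and sample values are hypothetical.

```python
import math


def evaluation_metrics(obs, sim):
    """Return r, RMSE, NSE, and KGE for paired observed/simulated values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    sq_err = sum((s - o) ** 2 for o, s in zip(obs, sim))
    ss_o = sum((o - mean_o) ** 2 for o in obs)  # sum of squares about the mean
    ss_s = sum((s - mean_s) ** 2 for s in sim)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))

    r = cov / math.sqrt(ss_o * ss_s)   # Pearson correlation
    alpha = math.sqrt(ss_s / ss_o)     # variability ratio (std_sim / std_obs)
    beta = mean_s / mean_o             # bias ratio (mean_sim / mean_obs)

    return {
        "r": r,
        "rmse": math.sqrt(sq_err / n),
        # NSE: 1 minus error variance relative to observed variance.
        "nse": 1.0 - sq_err / ss_o,
        # KGE (assumed variant): distance from the ideal point (1, 1, 1).
        "kge": 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
    }


# Made-up soil moisture values (cm3/cm3): the simulation is biased by +0.05.
observed = [0.10, 0.20, 0.30, 0.40]
simulated = [o + 0.05 for o in observed]
scores = evaluation_metrics(observed, simulated)
```

A constant bias like the one in this example leaves the correlation at 1 but lowers both NSE and KGE, which is why the abstract reports these scores alongside the correlation coefficient.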
