Title: Upscaling Soil Organic Carbon Measurements at the Continental Scale Using Multivariate Clustering Analysis and Machine Learning
Abstract Estimates of soil organic carbon (SOC) stocks are essential for many environmental applications. However, significant inconsistencies exist in SOC stock estimates for the U.S. across current SOC maps. We propose a framework that combines unsupervised multivariate geographic clustering (MGC) and supervised Random Forests regression, improving SOC maps by capturing heterogeneous relationships with SOC drivers. We first used MGC to divide the U.S. into 20 SOC regions based on the similarity of covariates (soil biogeochemical, bioclimatic, biological, and physiographic variables). Subsequently, separate Random Forests models were trained for each SOC region, utilizing environmental covariates and SOC observations. Our estimated SOC stocks for the U.S. (52.6 ± 3.2 Pg for 0–30 cm and 108.3 ± 8.2 Pg for 0–100 cm depth) were within the range estimated by existing products such as the Harmonized World Soil Database, HWSD (46.7 Pg for 0–30 cm and 90.7 Pg for 0–100 cm depth) and SoilGrids 2.0 (45.7 Pg for 0–30 cm and 133.0 Pg for 0–100 cm depth). However, independent validation with soil profile data from the National Ecological Observatory Network showed that our approach (R² = 0.51) outperformed the estimates obtained from HWSD (R² = 0.23) and SoilGrids 2.0 (R² = 0.39) for the topsoil (0–30 cm). Uncertainty analysis (e.g., low representativeness and high coefficients of variation) identified regions requiring more measurements, such as Alaska and the deserts of the U.S. Southwest. Our approach effectively captures the heterogeneous relationships between widely available predictors and the current SOC baseline across regions, offering reliable SOC estimates at 1 km resolution for benchmarking Earth system models.
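The cluster-then-regress idea in the abstract above can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: plain k-means stands in for MGC, a one-variable least-squares line stands in for Random Forests, and the data, function names, and parameter values are all hypothetical. It shows why fitting one model per region can capture relationships that a single pooled model misses.

```python
import random
from statistics import mean

def sq_dist(p, q):
    """Squared Euclidean distance between two covariate vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation
    (a stand-in for MGC, which is an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(mean(dim) for dim in zip(*members))
    return labels

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for Random Forests)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    m = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - m) ** 2 for o in obs)
    return 1 - ss_res / ss_tot

# Hypothetical data: two regions whose SOC responds to covariate 0
# with opposite slopes (a "heterogeneous relationship").
rng = random.Random(0)
covariates, soc = [], []
for _ in range(50):  # region A: SOC increases with covariate 0
    x0, x1 = rng.uniform(0, 2), rng.uniform(0, 2)
    covariates.append((x0, x1))
    soc.append(10 + 3 * x0 + rng.gauss(0, 0.3))
for _ in range(50):  # region B: SOC decreases with covariate 0
    x0, x1 = rng.uniform(8, 10), rng.uniform(8, 10)
    covariates.append((x0, x1))
    soc.append(40 - 3 * x0 + rng.gauss(0, 0.3))

# Step 1: unsupervised regionalisation from covariates only.
labels = kmeans(covariates, k=2)

# Baseline: one pooled model for all sites.
a, b = fit_line([c[0] for c in covariates], soc)
r2_global = r_squared(soc, [a + b * c[0] for c in covariates])

# Step 2: one model per region.
pred_regional = [0.0] * len(soc)
for region in set(labels):
    idx = [i for i, lab in enumerate(labels) if lab == region]
    a, b = fit_line([covariates[i][0] for i in idx], [soc[i] for i in idx])
    for i in idx:
        pred_regional[i] = a + b * covariates[i][0]
r2_regional = r_squared(soc, pred_regional)

# In-sample scores only; the study validated against independent profiles.
print(f"pooled R2 = {r2_global:.2f}, per-region R2 = {r2_regional:.2f}")
```

The scores here are computed on the training data for brevity; a faithful evaluation would hold out independent observations, as the study did with the National Ecological Observatory Network profiles.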
Award ID(s):
2217817 2106137 2106138
PAR ID:
10493478
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Journal of Geophysical Research: Biogeosciences
Volume:
129
Issue:
2
ISSN:
2169-8953
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Data Description:

     To improve SOC estimation in the United States, we upscaled site-based SOC measurements to the continental scale using a multivariate geographic clustering (MGC) approach coupled with machine learning models. First, we used the MGC approach to segment the United States at 30 arc second resolution into 20 SOC regions based on principal component information from environmental covariates (gNATSGO soil properties, WorldClim bioclimatic variables, MODIS biological variables, and physiographic variables). We then trained separate random forest model ensembles for each of the SOC regions identified, using environmental covariates and soil profile measurements from the International Soil Carbon Network (ISCN) and an Alaska soil profile dataset. Estimated United States SOC stocks for the 0-30 cm and 0-100 cm depths were 52.6 ± 3.2 and 108.3 ± 8.2 Pg C, respectively.

     Files in collection (32):

     Collection contains 22 soil properties geospatial rasters, 4 soil SOC geospatial rasters, 2 ISCN site SOC observations csv files, and 4 R scripts

     gNATSGO TIF files:

     ├── available_water_storage_30arc_30cm_us.tif     [30 cm depth soil available water storage]
     ├── available_water_storage_30arc_100cm_us.tif    [100 cm depth soil available water storage]
     ├── caco3_30arc_30cm_us.tif                       [30 cm depth soil CaCO3 content]
     ├── caco3_30arc_100cm_us.tif                      [100 cm depth soil CaCO3 content]
     ├── cec_30arc_30cm_us.tif                         [30 cm depth soil cation exchange capacity]
     ├── cec_30arc_100cm_us.tif                        [100 cm depth soil cation exchange capacity]
     ├── clay_30arc_30cm_us.tif                        [30 cm depth soil clay content]
     ├── clay_30arc_100cm_us.tif                       [100 cm depth soil clay content]
     ├── depthWT_30arc_us.tif                          [depth to water table]
     ├── kfactor_30arc_30cm_us.tif                     [30 cm depth soil erosion factor]
     ├── kfactor_30arc_100cm_us.tif                    [100 cm depth soil erosion factor]
     ├── ph_30arc_100cm_us.tif                         [100 cm depth soil pH]
     ├── ph_30arc_100cm_us.tif                         [30 cm depth soil pH]
     ├── pondingFre_30arc_us.tif                       [ponding frequency]
     ├── sand_30arc_30cm_us.tif                        [30 cm depth soil sand content]
     ├── sand_30arc_100cm_us.tif                       [100 cm depth soil sand content]
     ├── silt_30arc_30cm_us.tif                        [30 cm depth soil silt content]
     ├── silt_30arc_100cm_us.tif                       [100 cm depth soil silt content]
     ├── water_content_30arc_30cm_us.tif               [30 cm depth soil water content]
     └── water_content_30arc_100cm_us.tif              [100 cm depth soil water content]

     SOC TIF files:

     ├── 30cm SOC mean.tif     [30 cm depth soil SOC]
     ├── 100cm SOC mean.tif    [100 cm depth soil SOC]
     ├── 30cm SOC CV.tif       [30 cm depth soil SOC coefficient of variation]
     └── 100cm SOC CV.tif      [100 cm depth soil SOC coefficient of variation]

     Site observations csv files:

     ISCN_rmNRCS_addNCSS_30cm.csv     [30 cm ISCN sites SOC, NRCS sites replaced with NCSS centroid-removed data]
     ISCN_rmNRCS_addNCSS_100cm.csv    [100 cm ISCN sites SOC, NRCS sites replaced with NCSS centroid-removed data]

     Data format:

     Geospatial files are provided in GeoTIFF format in Lat/Lon WGS84 (EPSG:4326) projection at 30 arc second resolution.

     Geospatial projection:

     GEOGCS["GCS_WGS_1984",
      DATUM["D_WGS_1984",
      SPHEROID["WGS_1984",6378137,298.257223563]],
      PRIMEM["Greenwich",0],
      UNIT["Degree",0.017453292519943295]]
     (base) [jbk@theseus ltar_regionalization]$ g.proj -w
     GEOGCS["wgs84",
      DATUM["WGS_1984",
      SPHEROID["WGS_1984",6378137,298.257223563]],
      PRIMEM["Greenwich",0],
      UNIT["degree",0.0174532925199433]]
  2. ABSTRACT The current soil carbon paradigm puts particulate organic carbon (POC) as one of the major components of soil organic carbon worldwide, highlighting its pivotal role in carbon mitigation. In this study, we compiled a global dataset of 3418 data points of POC concentration in soils and applied empirical modeling and machine learning algorithms to investigate the spatial variation in POC concentration and its controls. The global POC concentration in topsoil (0–30 cm) is estimated as 3.02 g C/kg dry soil, exhibiting a declining trend from polar regions to the equator. Boreal forests contain the highest POC concentration, averaging 4.58 g C/kg dry soil, whereas savannas exhibit the lowest at 1.41 g C/kg dry soil. We developed a global map of soil POC density in soil profiles of 0–30 cm and 0–100 cm with an empirical model. The global stock of POC is 158.15 Pg C for 0–30 cm and 222.75 Pg C for 0–100 cm soil profiles, with substantial spatial variation. Analysis with a machine learning algorithm indicated that edaphic factors (i.e., bulk density and soil C content) are the predominant controls on POC concentration across biomes. However, the secondary controls vary among biomes, with strong climate controls in grassland, pasture, and shrubland and strong vegetation controls in forests. The biome‐level estimates and maps of POC density provide a benchmark for modeling C fractions in soils; the various controls on POC suggest incorporating biological and physicochemical mechanisms in soil C models to assess and forecast the soil POC dynamics in response to global change.
  3. Abstract. Soil represents the largest phosphorus (P) stock in terrestrial ecosystems. Determining the amount of soil P is a critical first step in identifying sites where ecosystem functioning is potentially limited by soil P availability. However, global patterns and predictors of soil total P concentration remain poorly understood. To address this knowledge gap, we constructed a database of total P concentration of 5275 globally distributed (semi-)natural soils from 761 published studies. We quantified the relative importance of 13 soil-forming variables in predicting soil total P concentration and then made further predictions at the global scale using a random forest approach. Soil total P concentration varied significantly among parent material types, soil orders, biomes, and continents and ranged widely from 1.4 to 9630.0 (median 430.0 and mean 570.0) mg kg⁻¹ across the globe. About two-thirds (65 %) of the global variation was accounted for by the 13 variables that we selected, among which soil organic carbon concentration, parent material, mean annual temperature, and soil sand content were the most important ones. While predicted soil total P concentrations increased significantly with latitude, they varied largely among regions with similar latitudes due to regional differences in parent material, topography, and/or climate conditions. Soil P stocks (excluding Antarctica) were estimated to be 26.8 ± 3.1 (mean ± standard deviation) Pg and 62.2 ± 8.9 Pg (1 Pg = 1 × 10¹⁵ g) in the topsoil (0–30 cm) and subsoil (30–100 cm), respectively. Our global map of soil total P concentration as well as the underlying drivers of soil total P concentration can be used to constrain Earth system models that represent the P cycle and to inform quantification of global soil P availability. Raw datasets and global maps generated in this study are available at https://doi.org/10.6084/m9.figshare.14583375 (He et al., 2021).
  4. Abstract Emerging evidence points out that the responses of soil organic carbon (SOC) to nitrogen (N) addition differ along the soil profile, highlighting the importance of synthesizing results from different soil layers. Here, using a global meta‐analysis, we found that N addition significantly enhanced topsoil (0–30 cm) SOC by 3.7% (±1.4%) in forests and grasslands. In contrast, SOC in the subsoil (30–100 cm) initially increased with N addition but decreased over time. The model selection analysis revealed that experimental duration and vegetation type are among the most important predictors across a wide range of climatic, environmental, and edaphic variables. The contrasting responses of SOC to N addition indicate the importance of considering deep soil layers, particularly for long‐term continuous N deposition. Finally, the lack of depth‐dependent SOC responses to N addition in experimental and modeling frameworks has likely resulted in the overestimation of changes in SOC storage under enhanced N deposition. 
  5. Abstract Soil is the source of the vast majority of food consumed on Earth, and soils constitute the largest terrestrial carbon pool. Soil erosion associated with agriculture reduces crop productivity, and the redistribution of soil organic carbon (SOC) by erosion has potential to influence the global carbon cycle. Tillage strongly influences the erosion and redistribution of soil and SOC. However, tillage is rarely considered in predictions of soil erosion in the U.S.; hence regionwide estimates of both the current magnitude and future trends of soil redistribution by tillage are unknown. Here we use a landscape evolution model to forecast soil and SOC redistribution in the Midwestern United States over centennial timescales. We predict that present‐day rates of soil and SOC erosion are 1.1 ± 0.4 kg ⋅ m⁻² ⋅ yr⁻¹ and 12 ± 4 g ⋅ m⁻² ⋅ yr⁻¹, respectively, but these rates will rapidly decelerate due to diffusive evolution of topography and the progressive depletion of SOC in eroding soil profiles. After 100 years, we forecast that 8.8 (+1.9/−2.1) Pg of soil and 0.17 (+0.03/−0.04) Pg of SOC will have eroded, causing the surface concentration of SOC to decrease by 4.4% (+0.9/−1.1%). Model simulations that include more widespread adoption of low‐intensity tillage (i.e., no‐till farming) determine that soil redistribution, SOC redistribution, and surficial SOC loss after 100 years would decrease by ∼95% if low‐intensity tillage is fully adopted. Our findings indicate that low‐intensity tillage could greatly decrease soil degradation and that the potential for agricultural soil erosion to influence the global carbon cycle will diminish with time due to a reduction in SOC burial.