Title: Upscaling Soil Organic Carbon Measurements at the Continental Scale Using Multivariate Clustering Analysis and Machine Learning
Abstract

Estimates of soil organic carbon (SOC) stocks are essential for many environmental applications. However, significant inconsistencies exist in SOC stock estimates for the U.S. across current SOC maps. We propose a framework that combines unsupervised multivariate geographic clustering (MGC) and supervised Random Forests regression, improving SOC maps by capturing heterogeneous relationships with SOC drivers. We first used MGC to divide the U.S. into 20 SOC regions based on the similarity of covariates (soil biogeochemical, bioclimatic, biological, and physiographic variables). Subsequently, separate Random Forests models were trained for each SOC region, utilizing environmental covariates and SOC observations. Our estimated SOC stocks for the U.S. (52.6 ± 3.2 Pg for 0–30 cm and 108.3 ± 8.2 Pg for 0–100 cm depth) were within the range estimated by existing products like Harmonized World Soil Database, HWSD (46.7 Pg for 0–30 cm and 90.7 Pg for 0–100 cm depth) and SoilGrids 2.0 (45.7 Pg for 0–30 cm and 133.0 Pg for 0–100 cm depth). However, independent validation with soil profile data from the National Ecological Observatory Network showed that our approach (R² = 0.51) outperformed the estimates obtained from Harmonized World Soil Database (R² = 0.23) and SoilGrids 2.0 (R² = 0.39) for the topsoil (0–30 cm). Uncertainty analysis (e.g., low representativeness and high coefficients of variation) identified regions requiring more measurements, such as Alaska and the deserts of the U.S. Southwest. Our approach effectively captures the heterogeneous relationships between widely available predictors and the current SOC baseline across regions, offering reliable SOC estimates at 1 km resolution for benchmarking Earth system models.
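
The cluster-then-regress idea can be sketched in a few lines. The example below is a toy under stated assumptions, not the authors' implementation: plain k-means stands in for MGC, a least-squares line stands in for each regional Random Forests model, and the covariates (temp, moisture), the synthetic SOC values, and all function names are hypothetical. It clusters sites on covariates alone, fits one model per cluster, and compares held-out R² against a single global model.

```python
import random
from statistics import fmean


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iters=25, seed=0):
    """Plain k-means; a simple stand-in (assumption) for the MGC step."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [
            tuple(fmean(dim) for dim in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def fit_line(xs, ys):
    """Least-squares line; stands in (assumption) for a regional Random Forest."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def make_sites(n, rng):
    """Synthetic ((temp, moisture), SOC) pairs with opposite trends in two regions."""
    sites = []
    for _ in range(n):
        if rng.random() < 0.5:
            temp = rng.uniform(0.0, 10.0)
            soc = 10.0 - 0.8 * temp
        else:
            temp = rng.uniform(20.0, 30.0)
            soc = -14.0 + 0.8 * temp
        moisture = rng.uniform(0.0, 1.0)
        sites.append(((temp, moisture), soc + rng.gauss(0.0, 0.3)))
    return sites


def predict(model, covariates):
    slope, intercept = model
    return slope * covariates[0] + intercept


rng = random.Random(42)
train, valid = make_sites(300, rng), make_sites(100, rng)

# Step 1: cluster sites on covariates only (SOC is not used here).
centroids = kmeans([x for x, _ in train], k=2)

# Step 2: fit one model per region, plus one global model for comparison.
regional = {}
for region in range(len(centroids)):
    members = [(x, y) for x, y in train if nearest(x, centroids) == region]
    regional[region] = fit_line([x[0] for x, _ in members], [y for _, y in members])
global_model = fit_line([x[0] for x, _ in train], [y for _, y in train])

# Step 3: score both approaches on the held-out sites.
observed = [y for _, y in valid]
regional_pred = [predict(regional[nearest(x, centroids)], x) for x, _ in valid]
global_pred = [predict(global_model, x) for x, _ in valid]
r2_regional = r_squared(observed, regional_pred)
r2_global = r_squared(observed, global_pred)
print(f"regional R2 = {r2_regional:.2f}, global R2 = {r2_global:.2f}")
```

On data like this, where the SOC response to a covariate reverses between regions, the single global fit explains almost none of the held-out variance while the regional fits explain most of it. R² is computed here as 1 − SS_res/SS_tot; the abstract does not say which definition was used.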

 
Award ID(s):
2217817
NSF-PAR ID:
10494630
Publisher / Repository:
DOI PREFIX: 10.1029
Journal Name:
Journal of Geophysical Research: Biogeosciences
Volume:
129
Issue:
2
ISSN:
2169-8953
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Data Description:

    To improve SOC estimation in the United States, we upscaled site-based SOC measurements to the continental scale using a multivariate geographic clustering (MGC) approach coupled with machine learning models. First, we used the MGC approach to segment the United States, at 30 arc second resolution, into 20 SOC regions based on principal component information from environmental covariates (gNATSGO soil properties, WorldClim bioclimatic variables, MODIS biological variables, and physiographic variables). We then trained a separate ensemble of random forest models for each SOC region, using environmental covariates and soil profile measurements from the International Soil Carbon Network (ISCN) and an Alaska soil profile dataset. Our estimated United States SOC stocks for the 0-30 cm and 0-100 cm depths were 52.6 ± 3.2 and 108.3 ± 8.2 Pg C, respectively.

    Files in collection (32):

    The collection contains 22 soil property geospatial rasters, 4 SOC geospatial rasters, 2 ISCN site SOC observation CSV files, and 4 R scripts.

    gNATSGO TIF files:

    ├── available_water_storage_30arc_30cm_us.tif   [30 cm depth soil available water storage]
    ├── available_water_storage_30arc_100cm_us.tif  [100 cm depth soil available water storage]
    ├── caco3_30arc_30cm_us.tif                     [30 cm depth soil CaCO3 content]
    ├── caco3_30arc_100cm_us.tif                    [100 cm depth soil CaCO3 content]
    ├── cec_30arc_30cm_us.tif                       [30 cm depth soil cation exchange capacity]
    ├── cec_30arc_100cm_us.tif                      [100 cm depth soil cation exchange capacity]
    ├── clay_30arc_30cm_us.tif                      [30 cm depth soil clay content]
    ├── clay_30arc_100cm_us.tif                     [100 cm depth soil clay content]
    ├── depthWT_30arc_us.tif                        [depth to water table]
    ├── kfactor_30arc_30cm_us.tif                   [30 cm depth soil erosion factor]
    ├── kfactor_30arc_100cm_us.tif                  [100 cm depth soil erosion factor]
    ├── ph_30arc_30cm_us.tif                        [30 cm depth soil pH]
    ├── ph_30arc_100cm_us.tif                       [100 cm depth soil pH]
    ├── pondingFre_30arc_us.tif                     [ponding frequency]
    ├── sand_30arc_30cm_us.tif                      [30 cm depth soil sand content]
    ├── sand_30arc_100cm_us.tif                     [100 cm depth soil sand content]
    ├── silt_30arc_30cm_us.tif                      [30 cm depth soil silt content]
    ├── silt_30arc_100cm_us.tif                     [100 cm depth soil silt content]
    ├── water_content_30arc_30cm_us.tif             [30 cm depth soil water content]
    └── water_content_30arc_100cm_us.tif            [100 cm depth soil water content]

    SOC TIF files:

    ├── 30cm SOC mean.tif    [30 cm depth soil SOC]
    ├── 100cm SOC mean.tif   [100 cm depth soil SOC]
    ├── 30cm SOC CV.tif      [30 cm depth soil SOC coefficient of variation]
    └── 100cm SOC CV.tif     [100 cm depth soil SOC coefficient of variation]
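
The CV rasters summarize how much the ensemble predictions disagree at each pixel. A minimal sketch of that calculation follows. It assumes CV is the sample standard deviation of the ensemble predictions divided by their mean, expressed as a fraction; the exact definition and scaling used for the rasters are not stated here. The ensemble values and the function name are hypothetical.

```python
import math
from statistics import fmean, stdev


def pixel_mean_cv(values):
    """Return (mean, CV) for one pixel's ensemble predictions.

    CV is the sample standard deviation divided by the mean, as a fraction.
    A zero mean gives an undefined CV, reported as NaN.
    """
    mean = fmean(values)
    if mean == 0:
        return mean, math.nan
    return mean, stdev(values) / mean


# Three hypothetical ensemble members, each predicting SOC for three pixels.
members = [
    [4.0, 10.0, 0.0],
    [5.0, 12.0, 0.0],
    [6.0, 14.0, 0.0],
]

# zip(*members) regroups the predictions pixel by pixel.
results = [pixel_mean_cv(pixel) for pixel in zip(*members)]
for index, (mean, cv) in enumerate(results):
    print(f"pixel {index}: mean = {mean:.2f}, CV = {cv:.3f}")
```

Pixels with a high CV are those where the ensemble members disagree most relative to the predicted mean, which is the kind of signal the abstract uses to flag regions needing more measurements.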

    Site observation CSV files:

    ├── ISCN_rmNRCS_addNCSS_30cm.csv    [30 cm depth SOC at ISCN sites; NRCS sites replaced with NCSS data from which centroid records were removed]
    └── ISCN_rmNRCS_addNCSS_100cm.csv   [100 cm depth SOC at ISCN sites; NRCS sites replaced with NCSS data from which centroid records were removed]


    Data format:

    Geospatial files are provided in GeoTIFF format in the geographic (lat/lon) WGS84 coordinate system (EPSG:4326) at 30 arc second resolution.

    Geospatial projection

    GEOGCS["GCS_WGS_1984", DATUM["D_WGS_1984", SPHEROID["WGS_1984",6378137,298.257223563]], PRIMEM["Greenwich",0], UNIT["Degree",0.017453292519943295]]
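
A 30 arc second grid is often described as 1 km, but in a geographic (lat/lon) grid the ground area of a cell shrinks with latitude. A rough sketch of that effect, and of how per-pixel SOC densities could be summed into a stock in Pg C, follows. It assumes a spherical Earth with a mean radius of 6371.0088 km and SOC values in kg C m^-2; the units of the SOC rasters are not stated here, and the function names and sample pixels are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)
CELL_DEG = 30.0 / 3600.0     # 30 arc seconds expressed in degrees


def cell_area_km2(lat_center_deg, cell_deg=CELL_DEG, radius_km=EARTH_RADIUS_KM):
    """Area of a lat/lon grid cell centred at the given latitude, on a sphere."""
    half = cell_deg / 2.0
    lat_south = math.radians(lat_center_deg - half)
    lat_north = math.radians(lat_center_deg + half)
    width_rad = math.radians(cell_deg)
    return radius_km ** 2 * width_rad * (math.sin(lat_north) - math.sin(lat_south))


def stock_pg(pixels):
    """Sum (latitude, SOC density in kg C m^-2) pixels into a stock in Pg C."""
    # 1 km^2 = 1e6 m^2; 1 Pg = 1e15 g = 1e12 kg.
    total_kg = sum(density * cell_area_km2(lat) * 1e6 for lat, density in pixels)
    return total_kg / 1e12


for lat in (0.0, 40.0, 65.0):
    print(f"cell area at {lat:4.1f} deg latitude: {cell_area_km2(lat):.3f} km^2")

# Two hypothetical pixels: (latitude in degrees, SOC density in kg C m^-2).
sample_pixels = [(40.0, 5.0), (65.0, 20.0)]
print(f"stock of the sample pixels: {stock_pg(sample_pixels):.3e} Pg C")
```

Weighting each pixel by its own area matters most at high latitudes such as Alaska, where a 30 arc second cell covers well under half the area of a cell at the equator.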

     

     
  2. Abstract. Soil represents the largest phosphorus (P) stock in terrestrial ecosystems. Determining the amount of soil P is a critical first step in identifying sites where ecosystem functioning is potentially limited by soil P availability. However, global patterns and predictors of soil total P concentration remain poorly understood. To address this knowledge gap, we constructed a database of total P concentration of 5275 globally distributed (semi-)natural soils from 761 published studies. We quantified the relative importance of 13 soil-forming variables in predicting soil total P concentration and then made further predictions at the global scale using a random forest approach. Soil total P concentration varied significantly among parent material types, soil orders, biomes, and continents and ranged widely from 1.4 to 9630.0 (median 430.0 and mean 570.0) mg kg⁻¹ across the globe. About two-thirds (65%) of the global variation was accounted for by the 13 variables that we selected, among which soil organic carbon concentration, parent material, mean annual temperature, and soil sand content were the most important ones. While predicted soil total P concentrations increased significantly with latitude, they varied largely among regions with similar latitudes due to regional differences in parent material, topography, and/or climate conditions. Soil P stocks (excluding Antarctica) were estimated to be 26.8 ± 3.1 (mean ± standard deviation) Pg and 62.2 ± 8.9 Pg (1 Pg = 1 × 10¹⁵ g) in the topsoil (0–30 cm) and subsoil (30–100 cm), respectively. Our global map of soil total P concentration as well as the underlying drivers of soil total P concentration can be used to constrain Earth system models that represent the P cycle and to inform quantification of global soil P availability. Raw datasets and global maps generated in this study are available at https://doi.org/10.6084/m9.figshare.14583375 (He et al., 2021).
  3. Abstract

    The “4 per 1,000” initiative was launched at the 21st Conference of the Parties (COP21), stimulating a long‐standing debate on the potential of no‐till (NT) to promote soil C sequestration. Previous reviews found little or no soil organic C (SOC) accrual in NT soils as compared with full inversion tillage when soils are sampled deeper than 30 cm. Here, we present the results of a global meta‐analysis of studies assessing SOC and total N (TN) storage and dynamics in NT and tilled soils from the most important agricultural regions of the world. Overall, our results show that NT soils stored 6.7 ± 1.9 Mg C ha⁻¹ and 1.1 ± 0.4 Mg N ha⁻¹ more than tilled soils (0‐to‐100‐cm depth) with an average of 16 yr of NT, in contrast with previous findings. However, C sequestration (+4.7 ± 1.9 Mg C ha⁻¹ in the 0‐to‐60‐cm depth with an average of 11 yr of NT) depended on the association of NT with increased crop frequency and the inclusion of legume cover crops. Single‐cropping systems lack the necessary C inputs to offset SOC losses in the soil profile (below 30‐cm depth). However, double‐cropping systems decreased soil TN, which may constrain future C sequestration. The use of legumes alleviated TN loss and supported soil C sequestration. Briefly, our findings indicate that NT can avoid SOC losses from tilled soils, partially offsetting CO₂ emissions from agriculture. Moreover, NT with agricultural intensification can promote soil C sequestration, thus contributing to soil quality, food security, and adaptation to climate change.

     
  4. Abstract

    Tidal marshes store large amounts of organic carbon in their soils. Field data quantifying soil organic carbon (SOC) stocks provide an important resource for researchers, natural resource managers, and policy-makers working towards the protection, restoration, and valuation of these ecosystems. We collated a global dataset of tidal marsh soil organic carbon (MarSOC) from 99 studies that includes location, soil depth, site name, dry bulk density, SOC, and/or soil organic matter (SOM). The MarSOC dataset includes 17,454 data points from 2,329 unique locations, and 29 countries. We generated a general transfer function for the conversion of SOM to SOC. Using this data we estimated a median (± median absolute deviation) value of 79.2 ± 38.1 Mg SOC ha⁻¹ in the top 30 cm and 231 ± 134 Mg SOC ha⁻¹ in the top 1 m of tidal marsh soils globally. This data can serve as a basis for future work, and may contribute to incorporation of tidal marsh ecosystems into climate change mitigation and adaptation strategies and policies.

     
  5. Abstract

    Soil is the source of the vast majority of food consumed on Earth, and soils constitute the largest terrestrial carbon pool. Soil erosion associated with agriculture reduces crop productivity, and the redistribution of soil organic carbon (SOC) by erosion has potential to influence the global carbon cycle. Tillage strongly influences the erosion and redistribution of soil and SOC. However, tillage is rarely considered in predictions of soil erosion in the U.S.; hence regionwide estimates of both the current magnitude and future trends of soil redistribution by tillage are unknown. Here we use a landscape evolution model to forecast soil and SOC redistribution in the Midwestern United States over centennial timescales. We predict that present‐day rates of soil and SOC erosion are 1.1 ± 0.4 kg ⋅ m⁻² ⋅ yr⁻¹ and 12 ± 4 g ⋅ m⁻² ⋅ yr⁻¹, respectively, but these rates will rapidly decelerate due to diffusive evolution of topography and the progressive depletion of SOC in eroding soil profiles. After 100 years, we forecast that 8.8 (+1.9/−2.1) Pg of soil and 0.17 (+0.03/−0.04) Pg of SOC will have eroded, causing the surface concentration of SOC to decrease by 4.4% (+0.9/−1.1%). Model simulations that include more widespread adoption of low‐intensity tillage (i.e., no‐till farming) determine that soil redistribution, SOC redistribution, and surficial SOC loss after 100 years would decrease by ∼95% if low‐intensity tillage is fully adopted. Our findings indicate that low‐intensity tillage could greatly decrease soil degradation and that the potential for agricultural soil erosion to influence the global carbon cycle will diminish with time due to a reduction in SOC burial.

     