Characterizing Spatial Variability of Soil Organic Carbon Through Improved Machine-Learning Modeling With In Situ Data Resampling: A Case Study in Alaska

Wei Peng, Yonghong Yi, Umakant Mishra, Kazem Bakian-Dogaheh, John S. Kimball, Mahta Moghaddam, Hans W. Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Sparse and unevenly distributed soil samples across the northern high-latitude region greatly limit the accuracy of soil organic carbon (SOC) mapping. As a result, SOC estimates for this region differ substantially, which makes it challenging to characterize SOC spatial variability and its potential responses to climate change and permafrost degradation. To address these challenges, we enhanced a machine-learning model for SOC mapping by developing a data resampling approach that accounts for the spatial heterogeneity of the soil samples, using Alaska as a case study. Specifically, in situ SOC data were resampled with weights proportional to the variance within a 15-km radius and then fit using a random forest (RF) regression model. Multiple features, including temporal composites of Sentinel-1 C-band radar backscatter, vegetation indices from Sentinel-2, climate indices including thawing and freezing indices from the Moderate Resolution Imaging Spectroradiometer (MODIS), and ancillary topography data, were selected as inputs for the RF model after recursive feature elimination (RFE) to generate top-layer (0–30 cm) SOC content (SOCC) maps in Alaska at a 250-m resolution. The enhanced RF model with data resampling showed improved accuracy compared to the original RF model, with the coefficient of determination (R²) increasing from 0.36 to 0.56 and the root-mean-square error (RMSE) decreasing from 16% to 11% for the surface (0–10 cm) SOCC, and with slightly improved accuracy for the deeper (10–30 cm) SOCC. The enhanced RF model also captured local-scale variability of SOC better than the original RF model and the SoilGrids 2.0 dataset, with high-resolution remote sensing indices playing a major role. The improved SOCC estimates were then used to estimate soil bulk density (BD) and calculate total SOC stock for Alaska. Our results suggest that Alaskan topsoil (0–30 cm) stores approximately 25.21 ± 17.18 Pg C, with the largest SOC reserves found in shrublands. These findings highlight the importance of accounting for spatial heterogeneity in in situ samples and leveraging high-resolution remote sensing data for regional soil mapping.
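The resampling step described in the abstract (weights proportional to the SOC variance within a 15-km radius) can be illustrated with a short sketch. This is not the authors' implementation: the function names, the great-circle distance, the use of population variance, the small `floor` that keeps isolated samples drawable, and drawing with replacement are all assumptions made for illustration.

```python
import math
import random
from statistics import pvariance

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used for great-circle distance


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def local_variance_weights(samples, radius_km=15.0, floor=1e-6):
    """Return normalized weights proportional to local SOCC variance.

    samples: list of (lat, lon, socc) tuples.
    floor:   hypothetical small constant so that samples with no
             neighbours (variance 0) keep a nonzero weight.
    """
    raw = []
    for lat_i, lon_i, _ in samples:
        # SOCC values of all samples within the radius (includes the sample itself)
        nearby = [s for (la, lo, s) in samples
                  if haversine_km(lat_i, lon_i, la, lo) <= radius_km]
        var = pvariance(nearby) if len(nearby) > 1 else 0.0
        raw.append(var + floor)
    total = sum(raw)
    return [w / total for w in raw]


def resample(samples, weights, n=None, seed=0):
    """Draw n samples with replacement, with probability given by weights."""
    rng = random.Random(seed)
    n = len(samples) if n is None else n
    return rng.choices(samples, weights=weights, k=n)
```

The resampled set would then be passed to a random forest regressor. Samples in heterogeneous neighbourhoods are drawn more often, so the model sees more examples from areas where SOC changes over short distances.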

Original language: English
Article number: 4505414
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 63
State: Published - 2025

Keywords

  • Data resampling
  • machine learning
  • multisource remote sensing
  • soil organic carbon (SOC)
