Soil organic carbon (SOC) is a key indicator of soil health on croplands and a potential lever for carbon sequestration in agriculture. Both roles require tools for understanding spatial and temporal variations in SOC content. Multispectral satellites provide data on bare soil reflectance, which is influenced by SOC content. In this study, an extensive database of 34,418 soil analyses on 22,850 fields is leveraged to train a machine-learning model for SOC content prediction. The predictive covariates are derived from a bare soil composite of Sentinel-2 images over the Walloon region (Belgium), obtained from March to June over a three-year period (2019–2021), together with some environmental covariates. We observe that multispectral data is complementary to environmental covariates for explaining spatial variability in SOC content. Through feature elimination, four relevant spectral features were identified: the normalized differences of bands 3 (Green) and 2 (Blue), of bands 5 (Red-Edge) and 11 (SWIR1), and of bands 11 (SWIR1) and 12 (SWIR2), plus the reflectance in band 4 (Red). These spectral features were combined with three environmental covariates: elevation, the agro-ecological zone and the fine fraction (< 20 µm) content. The resulting model predicts SOC content at field level with an RMSE of 2.7 g C kg⁻¹ and an R² of 0.56. Given this uncertainty, we conclude that multispectral data is insufficient for SOC content monitoring at parcel level but is a tool to consider for SOC content mapping. The SOC content map can be used for regional SOC content estimates, after modeling the autocorrelation of the model errors. This makes it possible to compare groups of fields with different management practices, or to assess the average SOC content of fields in a soil conservation program against a regional baseline.

• A dataset of 34,418 soil analyses was used to train models for SOC content prediction.
• A model combining spectral and environmental variables results in an RMSE of 2.7 g C kg⁻¹.
• Key spectral indices relate to absorption features around 560 nm, 700 nm and 2200 nm.
• Model accuracy is insufficient for SOC content monitoring at parcel level.
• Parcel-level estimates can be aggregated to regional averages and uncertainty.
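
To make the four spectral features named in the abstract concrete, here is a minimal Python sketch of how they might be computed for one bare-soil pixel. It is an illustration under stated assumptions, not the authors' pipeline. The normalized difference is taken as (a − b)/(a + b) with the bands in the order they are listed in the abstract, which is an assumed sign convention. The function names, dictionary keys and reflectance values are all hypothetical.

```python
from typing import Dict


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    if denom == 0:
        return float("nan")
    return (a - b) / denom


def spectral_features(refl: Dict[str, float]) -> Dict[str, float]:
    """Build the four spectral features from Sentinel-2 band reflectances.

    `refl` maps band names (B02, B03, B04, B05, B11, B12) to reflectance.
    The band order inside each normalized difference is an assumption.
    """
    return {
        "nd_b3_b2": normalized_difference(refl["B03"], refl["B02"]),    # Green vs Blue
        "nd_b5_b11": normalized_difference(refl["B05"], refl["B11"]),   # Red-Edge vs SWIR1
        "nd_b11_b12": normalized_difference(refl["B11"], refl["B12"]),  # SWIR1 vs SWIR2
        "b4": refl["B04"],                                              # Red reflectance
    }


# Hypothetical bare-soil reflectances, for illustration only.
pixel = {"B02": 0.06, "B03": 0.09, "B04": 0.12,
         "B05": 0.14, "B11": 0.28, "B12": 0.24}
features = spectral_features(pixel)
print(features)
```

In a setup like the one described, such features would be computed per pixel of the bare soil composite and combined with elevation, agro-ecological zone and fine fraction content before model training.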
Bièvre et al. (Fri,) studied this question.