Interpretable Attribution of Sentinel-1/2 and Environmental Covariates for Compositionally Closed Soil Mapping and Uncertainty Quantification
Wenhao Wang, Chao Dong, Bin Zhao, Yanling Li, Zhuoran Wang, Chunyan Chang

Soil particle size fractions (PSFs), namely sand, silt, and clay, are fundamental determinants of soil hydrological behavior, nutrient retention, and erodibility. Their spatial prediction nevertheless remains challenging because the data are compositional, prediction uncertainty is rarely quantified, and machine learning models offer limited interpretability. This study develops an integrated compositional mapping framework that addresses these three challenges jointly. It draws on multi-source Sentinel-1/2 and topographic covariates and couples the isometric log-ratio (ILR) transformation with Quantile Regression Forests (QRFs), a Monte Carlo simulation (MCS)-based strategy for propagating uncertainty from the latent (ILR) space to the physical space, and a Wrapper-SHAP attribution method. The framework was evaluated on croplands in the mountainous and hilly region of central Shandong, China, using elevation-stratified spatial cross-validation. Validation yielded R² values of 0.72, 0.61, and 0.59 for sand, silt, and clay, respectively, and a global Aitchison distance of 0.34. Critically, the MCS error propagation strategy effectively compensated for the shift in probability distribution introduced by the non-linear ILR back-transformation. As a result, all predicted compositions strictly satisfied compositional closure and the [0, 100%] constraint, and the prediction interval coverage probability (PICP) of each fraction aligned closely with the 90% nominal level.
Wrapper-SHAP overcame the limitations of direct attribution in compositional models and revealed the predictive associations of the multi-source covariates: high values of the remote sensing-derived Bare Soil Index (BSI) and Moisture Stress Index (MSI) were strongly associated with sand enrichment, whereas lower values of these indices, combined with elevated Normalized Difference Moisture Index (NDMI), Enhanced Vegetation Index (EVI), and anthropogenic indicators, favored silt and clay accumulation. The proposed framework provides a transferable methodological reference for remote sensing-integrated compositional soil mapping with reliable uncertainty estimates and interpretable driver identification at regional scales.
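
To make the latent-to-physical propagation concrete, the following is a minimal sketch of the ILR transformation, a Monte Carlo back-transformation, and the two evaluation quantities named above (Aitchison distance and PICP). It is not the study's implementation. The ILR basis `V`, the function names, the independent sampling of the two ILR coordinates, and the quantile values in the example are all assumptions chosen for illustration; in practice the quantiles would come from a fitted QRF.

```python
import numpy as np

# Orthonormal ILR basis for (sand, silt, clay). This particular basis is an
# assumption; the partition used in the study is not stated in the abstract.
V = np.array([
    [np.sqrt(2 / 3), -1 / np.sqrt(6), -1 / np.sqrt(6)],  # sand vs. (silt, clay)
    [0.0, 1 / np.sqrt(2), -1 / np.sqrt(2)],              # silt vs. clay
])

def ilr(x):
    """Map strictly positive compositions (..., 3) to ILR coordinates (..., 2)."""
    logx = np.log(np.asarray(x, dtype=float))
    clr = logx - logx.mean(axis=-1, keepdims=True)  # centered log-ratio
    return clr @ V.T

def ilr_inv(z, total=100.0):
    """Back-transform ILR coordinates to compositions that sum to `total`."""
    e = np.exp(np.asarray(z, dtype=float) @ V)
    return total * e / e.sum(axis=-1, keepdims=True)

def aitchison_distance(x, y):
    """Aitchison distance, i.e. the Euclidean distance in ILR space."""
    return np.linalg.norm(ilr(x) - ilr(y), axis=-1)

def mc_back_transform(q_levels, q_ilr, n_draws=5000, seed=0):
    """Propagate ILR-space quantiles to 90% intervals in physical space.

    q_levels: increasing quantile levels, shape (n_levels,)
    q_ilr:    predicted ILR quantiles for one location, shape (n_levels, 2)
    Simplifications (assumptions): the two ILR coordinates are sampled
    independently, and draws are limited to the range of q_levels.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(q_levels[0], q_levels[-1], size=(n_draws, 2))
    # Inverse-CDF sampling by linear interpolation between predicted quantiles
    z = np.column_stack(
        [np.interp(u[:, k], q_levels, q_ilr[:, k]) for k in range(2)]
    )
    draws = ilr_inv(z)  # every draw is closed and lies within [0, 100]
    lower, upper = np.percentile(draws, [5, 95], axis=0)
    return draws, lower, upper

def picp(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return np.mean((y_true >= lower) & (y_true <= upper))

# Illustrative, made-up ILR quantile predictions for a single location
q_levels = np.array([0.01, 0.25, 0.50, 0.75, 0.99])
q_ilr = np.array([
    [-0.2, 0.3],
    [0.1, 0.5],
    [0.3, 0.7],
    [0.5, 0.9],
    [0.8, 1.2],
])
draws, lower, upper = mc_back_transform(q_levels, q_ilr)
```

Because the intervals are computed from back-transformed draws rather than by back-transforming the ILR quantiles directly, each draw satisfies closure by construction, and the per-fraction bounds reflect the non-linearity of the inverse transformation.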