What question did this study set out to answer?

April 1, 2026 | Open Access

Validation and prediction challenges in data-driven landslide susceptibility mapping: insights from random sampling of training and test data in machine learning models

Key Points

The study investigates uncertainties in landslide susceptibility modeling caused by the random division of data into training and test sets
The analysis used an inventory of 304 landslides
Nine machine learning methods were evaluated over 100 random resampling iterations
Model performance was assessed with 10-fold cross-validation
Four validation metrics were used: Accuracy, AUROC, False Positive Rate (FPR), and False Negative Rate (FNR)
Random Forest achieved an AUROC of 0.945 and 88.15% accuracy, making it the most efficient model
Artificial Neural Networks followed with an AUROC of 0.939 and 87.68% accuracy
K-Nearest Neighbors showed an AUROC of 0.929 and 85.73% accuracy
Random Forest exhibited reduced performance when using spatial cross-validation compared to random cross-validation
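
The resampling scheme in the points above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`random_kfold`, `spatial_kfold`, `repeated_splits`), the grid-block approach to spatial folds, and the `block_size` value are all assumptions made for illustration, since the summary does not say how the spatial cross-validation was constructed. The sketch only shows why the two schemes differ: random folds scatter neighboring samples across training and test sets, while spatial folds keep each block of nearby samples together.

```python
import random


def random_kfold(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def spatial_kfold(coords, k=10, block_size=1000.0, seed=0):
    """Assign whole grid blocks to folds so nearby samples share a fold.

    block_size is a hypothetical value in map units, not taken from the study.
    """
    rng = random.Random(seed)
    blocks = {}
    for idx, (x, y) in enumerate(coords):
        key = (int(x // block_size), int(y // block_size))
        blocks.setdefault(key, []).append(idx)
    keys = sorted(blocks)
    rng.shuffle(keys)
    folds = [[] for _ in range(k)]
    for i, key in enumerate(keys):
        folds[i % k].extend(blocks[key])
    return folds


def repeated_splits(n_samples, repeats=100, k=10):
    """Yield (repeat, fold_id, train_idx, test_idx) for each repetition and fold."""
    for r in range(repeats):
        folds = random_kfold(n_samples, k=k, seed=r)
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield r, f, train_idx, test_idx
```

With the defaults, `repeated_splits` yields 100 x 10 = 1,000 train/test pairs, so each model can be scored 1,000 times and the spread of those scores gives a view of how sensitive the results are to the random split.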

Abstract

Results from stochastic models depend on the variability of the input dataset. This is particularly relevant in data-driven landslide susceptibility models, where inputs can vary depending on the methodological approach. Therefore, understanding the assumptions and limitations of these models is essential for accurately interpreting results and supporting land-use planning. This study explored uncertainties in landslide susceptibility modeling (LSM) resulting from the random division of input data, using an inventory of 304 landslides. The framework included nine machine learning methods, 100 random resampling iterations with 10-fold cross-validation, and four validation metrics: Accuracy (ACC), Area Under the ROC Curve (AUROC), False Positive Rate (FPR), and False Negative Rate (FNR). Statistical Significance Tests (SST) compared performance between methods. On average, Random Forest (RF) emerged as the most efficient algorithm, achieving an AUROC of 0.945, ACC of 88.15%, FNR of 11.13%, and FPR of 12.58%. It was followed by Artificial Neural Networks (ANN), with an AUROC of 0.939, ACC of 87.68%, FNR of 11.70%, and FPR of 12.95%. K-Nearest Neighbors (KNN) also showed strong results, with an AUROC of 0.929, ACC of 85.73%, FNR of 14.40%, and FPR of 14.14%. These three methods demonstrated more stable validation metrics, suggesting potential to reduce bias and variance. In contrast, Decision Tree (DT), Support Vector Machine (SVM), and Rule Learning (RL) showed higher variability and poorer performance. The SST confirmed RF as the most effective method, followed by ANN and KNN. However, the RF model exhibited a substantial decrease in predictive performance when evaluated using spatial rather than random cross-validation.
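
For readers unfamiliar with the four validation metrics, the following minimal Python sketch shows how they are conventionally computed for a binary landslide / non-landslide classifier. The function names and the toy labels and scores are hypothetical; the study's own implementation and classification threshold are not given in this summary, so the 0.5 threshold used here is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = landslide."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def acc_fpr_fnr(y_true, y_pred):
    """Accuracy, false positive rate, and false negative rate."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    acc = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)  # share of non-landslides flagged as landslides
    fnr = fn / (fn + tp)  # share of landslides that were missed
    return acc, fpr, fnr


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
acc, fpr, fnr = acc_fpr_fnr(y_true, y_pred)
auc = auroc(y_true, scores)
```

ACC, FPR, and FNR depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds. That difference is one reason to report them together.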


Cite This Study

Barella et al. studied this question.

synapsesocial.com/papers/69ccb7c216edfba7beb89d0e https://doi.org/10.1007/s10064-026-04911-5
