Characterizing landslides in a tectonically active regime like the Himalaya is challenging, particularly in the Dibang Valley of northeast (NE) India, where complex geology, erratic rainfall, steep terrain, slope instability, and recent anthropogenic activities increase vulnerability to landslides. While landslide susceptibility has been widely studied worldwide, it remains little studied in NE India despite the region's high vulnerability. To address this gap, this study explores landslide susceptibility in the region using a machine learning-based framework comprising eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) to model the influence of multiple conditioning factors. Both XGBoost and LightGBM demonstrated high predictive performance (AUC ≈ 0.96) and classified 25.48% (XGBoost) and 35% (LightGBM) of the area as having high to very high landslide susceptibility. The region around the NH-313 road section along the Dibang River and the settlements around Anini, Punli, and Etalin are highly vulnerable. The analysis is further supported by a comprehensive landslide inventory derived from multisource datasets, which highlights an increase in soil moisture content during the monsoon period that affects slope stability. To determine the contribution of individual conditioning factors to the model predictions, the SHapley Additive exPlanations (SHAP) method was employed. The results suggest that geospatial and hydro-meteorological variables, including elevation, lithology, lineament density, and rainfall, significantly influence the predicted susceptibility. The methodology adopted here is robust, and the findings support the need for long-term risk management, early warning systems, and hazard mitigation strategies, especially given the impact of infrastructure projects like the Etalin Hydropower Project on the spatial and temporal dynamics of landslides in the region.
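To make the two headline metrics concrete, the following minimal Python sketch shows how an AUC value and the percentage of area in the high to very high susceptibility classes could be computed from per-cell model scores. It is an illustration only, not the study's implementation: the function names, the toy labels and scores, the assumption of equal-area grid cells, and the 0.6 class break are all hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random landslide cell outscores a
    random non-landslide cell (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def area_share_above(scores: Sequence[float], threshold: float) -> float:
    """Percentage of equal-area cells whose score is at or above threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(s >= threshold for s in scores) / len(scores)


# Hypothetical toy data, not from the study: 1 = mapped landslide cell.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.92, 0.81, 0.40, 0.55, 0.30, 0.22, 0.10, 0.05]

auc = roc_auc(labels, scores)
# 0.6 is an assumed class break for "high to very high" susceptibility.
high_share = area_share_above(scores, threshold=0.6)
print(f"AUC = {auc:.3f}, high/very high share = {high_share:.1f}%")
```

Read this way, two models can reach a similar AUC while assigning different shares of the area to the high and very high classes, because AUC depends only on how cells are ranked, whereas the area share depends on the score distribution and the chosen class breaks.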
This visual summary serves as a pivotal entry point into the research, offering a concise overview of the study’s core findings and methodologies. The Data section illustrates the Dibang Valley study area and the multiple datasets employed, including satellite imagery, rainfall, soil moisture, digital elevation models, and landslide inventories. The Analyses panel highlights the geospatial and statistical procedures used to process these datasets and select relevant conditioning factors such as slope, lithology, lineament density, and NDVI. The Model component leverages two gradient boosting algorithms, XGBoost and LightGBM, integrated with SHAP explainability to provide transparent insights into the factors driving landslide susceptibility. The Results section displays susceptibility maps with clear spatial patterns, showing that 25.48% to 35% of the Dibang Valley falls within high and very high risk zones, with predictive accuracy of AUC ≈ 0.96. Finally, the Conclusion emphasizes the applied relevance of these findings, demonstrating how they inform early warning systems, infrastructure planning, and community resilience in landslide-prone areas. Collectively, the graphical abstract encapsulates the progression from data to decision-making, presenting the integration of explainable machine learning and geospatial analysis as a robust framework for understanding and mitigating precipitation-induced landslide hazards in the Eastern Himalayas.

Explainable ML (XGBoost, LightGBM) applied to landslide susceptibility in Dibang Valley.
SHAP analysis reveals rainfall, slope, and lithology as dominant triggering factors.
XGBoost: 25.48% and LightGBM: 35% of the area show high to very high susceptibility.
Models achieve high predictive accuracy (AUC ≈ 0.96) for precipitation-induced landslides.
Results support early warning, infrastructure planning, and community resilience.
Mihu et al. studied this question.