BACKGROUND: Adolescent suicide is a critical public health issue globally. Early detection of suicidal tendency remains challenging due to its concealed and multidimensional nature. This study aimed to develop and validate an interpretable machine learning model to predict suicidal tendency among Chinese secondary school students.

METHODS: A cross-sectional survey was conducted among 12,063 students from Suzhou, China. A total of 23 variables, including demographic, psychological, and behavioral factors, were collected. Seven machine learning models (LR + LASSO, LightGBM, SVM, KNN, DT, RF, and XGBoost) were developed and compared using fivefold cross-validation. Model performance was evaluated using AUC, sensitivity, specificity, calibration curves, and decision curve analysis. Feature importance was interpreted using SHAP values.

RESULTS: Among the participants, 21.98% exhibited suicidal tendency. XGBoost outperformed other models on the validation set, achieving an AUC of 0.802 (95% CI: 0.785–0.818), sensitivity of 0.686, specificity of 0.758, and a negative predictive value of 0.892. The top three predictors were depressed mood (PHQ2), self-dissatisfaction (PHQ6), and reluctance to seek help. SHAP analysis revealed that male students with high distress and low help-seeking intent constituted a high-risk subgroup.

CONCLUSION: The XGBoost-based model demonstrates strong predictive ability and clinical interpretability for identifying adolescents at risk of suicide. It highlights the importance of integrating psychological and behavioral factors in school-based screening programs, particularly for under-recognized subgroups such as distressed males who avoid seeking help.
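The reported sensitivity, specificity, and negative predictive value are linked through the prevalence of the outcome. The following is a minimal sketch of that relationship using Bayes' rule. The function name `predictive_values` is hypothetical, and the sketch assumes the overall sample prevalence (21.98%) also applies to the validation set, which the abstract does not state.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    Each term below is the expected fraction of the screened population
    falling into one cell of the 2x2 confusion matrix.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)   # P(at risk | flagged)
    npv = true_neg / (true_neg + false_neg)   # P(not at risk | not flagged)
    return ppv, npv


# Values taken from the abstract; the prevalence is an assumption for the
# validation set (see the note above).
ppv, npv = predictive_values(sensitivity=0.686, specificity=0.758, prevalence=0.2198)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under this assumption the computed NPV comes out close to, but not exactly equal to, the reported 0.892; a small gap of this kind would be expected if the prevalence in the validation set differs slightly from that of the full sample. The PPV produced here is an illustrative by-product of the same formula and is not a figure reported in the abstract.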
Yu et al. studied this question.