Depression is highly prevalent in elderly patients with gastrointestinal diseases (GID) or chronic liver diseases (CLD) and significantly impairs quality of life and treatment outcomes. This study aimed to develop and validate an interpretable machine learning (ML) model to identify depression risk in this population, addressing the “black box” limitation of conventional ML. This prospective analysis used data from the baseline (2018) and follow-up (2020) waves of the China Health and Retirement Longitudinal Study (CHARLS). Candidate predictors measured at baseline were selected via Least Absolute Shrinkage and Selection Operator (LASSO) regression. The outcome was incident depression at the 2020 follow-up, defined as a CES-D-10 score ≥ 10 among participants free of depression at baseline. Ten ML algorithms were used to construct models. Performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, precision, F1-score, calibration curves, and decision curve analysis. The SHapley Additive exPlanations (SHAP) framework was used to interpret feature contributions. Among 1,353 participants (424 with depression), LASSO identified 10 key predictors. The Logistic Regression (LR) model showed the best discriminative performance, with an AUC of 0.723 (95% CI: 0.674–0.772). SHAP analysis identified the top five predictors as self-reported health, life satisfaction, gender, education, and memory scores. We developed an interpretable ML model for predicting depression risk in elderly patients with GID or CLD. This tool may aid early detection and intervention, potentially improving clinical outcomes in this vulnerable population.
Chen et al. studied this question.
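The outcome definition and the evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the toy CES-D-10 scores, the predicted risks, and the 0.5 classification threshold are all assumptions made for illustration, and the sketch assumes that "free of depression at baseline" means a baseline CES-D-10 score below the same cutoff of 10. Only that cutoff and the list of metrics come from the abstract.

```python
from typing import Dict, Optional, Sequence


def incident_depression(baseline_cesd: int, followup_cesd: int,
                        cutoff: int = 10) -> Optional[bool]:
    """Label one participant (hypothetical helper).

    Returns None when the participant is already at or above the cutoff
    at baseline (assumed to mean excluded from the incident-depression
    cohort); otherwise True if the follow-up score reaches the cutoff.
    """
    if baseline_cesd >= cutoff:
        return None
    return followup_cesd >= cutoff


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, precision and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy (baseline, follow-up) CES-D-10 scores; invented for illustration.
cesd_pairs = [(4, 12), (11, 15), (6, 7), (9, 10), (3, 2), (8, 14)]
outcomes = [incident_depression(b, f) for b, f in cesd_pairs]
labels = [int(o) for o in outcomes if o is not None]  # drop excluded cases

# Hypothetical predicted risks for the five retained participants.
risks = [0.8, 0.3, 0.4, 0.6, 0.7]
preds = [int(r >= 0.5) for r in risks]  # 0.5 threshold is an assumption

metrics = classification_metrics(labels, preds)
model_auc = auc(labels, risks)
print(metrics)
print(round(model_auc, 3))
```

The pairwise AUC above is quadratic in the number of participants, which is fine for a cohort of this size; a rank-based implementation or a library routine would be the usual choice for larger data. Calibration curves, decision curve analysis, and SHAP values are not covered by this sketch.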