Objective: To develop an interpretable machine learning model for classifying HIV viral load suppression (VLS) from routinely collected clinical data in a low-resource Ethiopian cohort, enabling early identification of patients at risk of treatment failure.
Methods: A retrospective cohort study was conducted using electronic medical records of 4,152 patients on antiretroviral therapy (ART) at the University of Gondar Comprehensive Specialized Hospital, Ethiopia (March 2005–December 2024). Eight machine learning algorithms (Logistic Regression, Random Forest, Gradient Boosting, Naive Bayes, Support Vector Machine, K-Nearest Neighbors, Decision Tree, and XGBoost) were trained and optimized to classify binary VLS outcomes. Model performance was evaluated using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). The best-performing model was interpreted using SHapley Additive exPlanations (SHAP) to identify the most influential predictors and the direction of their effects.
Results: The optimized Gradient Boosting model achieved the highest performance, with 76% accuracy, an F1-score of 0.74, and an AUC-ROC of 0.79. Baseline CD4 category and duration on ART (months) emerged as the most influential predictors. SHAP analysis showed that longer ART duration and higher baseline CD4 count were associated with increased odds of suppression, while advanced WHO clinical stage (Stage 4) and male sex were associated with unsuppressed viral load. Individual-level predictions were visualized using waterfall plots to enhance clinical interpretability.
Conclusion: An interpretable Gradient Boosting model can classify viral load suppression from routinely collected clinical data in a resource-limited setting, with an AUC-ROC of 0.79. The model's predictions align with established clinical knowledge, offering a potential decision-support tool for identifying patients at risk of treatment failure at this single site, pending external validation in other cohorts and settings.
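The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It is not the study's code and uses no study data: the labels and scores below are invented toy values, the coding 1 = suppressed / 0 = unsuppressed is an assumption, the 0.5 decision threshold is an assumption, and the helper names `binary_metrics` and `auc_roc` are hypothetical.

```python
# Toy illustration of accuracy, precision, recall, F1-score, and AUC-ROC
# for a binary classifier. All data here are made up for demonstration.

def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix-based metrics for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (assumed: 1 = suppressed) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1-score depend on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is one reason to report both kinds of metric together.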
Mengistu et al. (Sun,) studied this question.