Achieving high retention of people living with HIV (PLHIV) in care remains a challenge in Uganda, despite substantial progress towards the UNAIDS 95-95-95 targets. The purpose of this study was to apply and compare machine learning and deep learning models, using de-identified longitudinal clinical data collected from HIV clinics in Uganda, to predict clients at high risk of missing treatment appointments. Predictions were made at the visit level, and a missed appointment was defined as failure to attend a scheduled clinical visit within 28 days of the assigned return date, in accordance with the PEPFAR definition of loss to follow-up. We compared the performance of traditional machine learning models (i.e., commonly used non-sequential models such as Decision Tree, Random Forest, AdaBoost, and XGBoost) with that of the Bidirectional Encoder Representations from Transformers (BERT) model, which is designed to model sequential and contextual information in longitudinal data. Feature importance analysis using the SHapley Additive exPlanations (SHAP) method was used to identify the most influential predictors. We also evaluated the impact of three sampling techniques (undersampling, oversampling, and synthetic minority oversampling) for addressing class imbalance and improving model performance. Model performance was evaluated using accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUC-ROC). The study was based on a longitudinal dataset of 66,206 PLHIV who initiated HIV care during 2000–2023 in 86 health facilities. The data comprised 1,479,121 clinical visits, an average of 22 clinical visits per client; 158,266 (10.7%) of these visits were missed appointments, and 49,588 (74.9%) clients missed at least one appointment. Median age was 38.0 years (interquartile range [IQR]: 29.0–47.0), and the majority (n = 43,132, 65.1%) were female.
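The visit-level outcome definition and the undersampling step described above can be illustrated with a minimal Python sketch. This is not the study's code: the function names `is_missed` and `undersample`, the treatment of a visit with no recorded attendance as missed, and the choice to balance the classes exactly 1:1 are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta
from typing import Optional

GRACE_DAYS = 28  # window stated in the abstract


def is_missed(scheduled: date, attended: Optional[date],
              grace_days: int = GRACE_DAYS) -> bool:
    """Return True when no attendance falls within `grace_days` of the return date.

    Assumption: a visit with no recorded attendance counts as missed.
    """
    if attended is None:
        return True
    return attended > scheduled + timedelta(days=grace_days)


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumption: a 1:1 class ratio; the sampling ratio used in the study is not stated.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y]
    neg = [i for i, y in enumerate(labels) if not y]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]
```

With roughly one visit in ten labelled as missed, undersampling of this kind discards most attended visits from the training set, which is one reason it is usually compared against oversampling alternatives.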
The BERT model achieved the highest performance in predicting missed appointments, with an AUC of 0.96 (CI: 0.94–0.97), precision of 89.0% (CI: 87.1%–90.9%), recall of 100% (CI: 100.0%–100.0%), and an F1-score of 94.0% (CI: 92.5%–95.5%) for the missed appointment class. In comparison, the best-performing traditional model, XGBoost with undersampling, achieved an AUC of 0.90 (CI: 0.87–0.92), precision of 33.2% (CI: 30.3%–36.1%), recall of 79.7% (CI: 77.2%–82.2%), and an F1-score of 46.8% (CI: 43.7%–49.9%) for the missed appointment class. Feature importance analysis showed that treatment adherence, visit frequency, treatment duration, and visits on the current regimen were the most influential predictors of appointment interruption. In this retrospective analysis, transformer-based sequential modeling showed promising performance for visit-level prediction of missed HIV appointments compared with the non-sequential models evaluated. These findings should be interpreted cautiously, as further temporal and external validation is required. If validated prospectively, such models may support EMR workflows by identifying clients at higher risk of missed visits, potentially enabling more targeted outreach and retention support in HIV care programs.
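The reported F1-scores follow from the precision and recall values, since F1 is their harmonic mean. The short Python sketch below recomputes it from the rounded figures given above; the helper name `f1_score` is illustrative, and small differences from the reported values are expected because the inputs are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values taken from the results above.
xgboost_f1 = f1_score(0.332, 0.797)  # XGBoost with undersampling
bert_f1 = f1_score(0.890, 1.000)     # BERT

print(f"XGBoost F1: {xgboost_f1:.3f}")
print(f"BERT F1:    {bert_f1:.3f}")
```

Both results agree with the reported F1-scores to within about 0.2 percentage points. The gap between the two models is driven mainly by precision: at a recall of 79.7%, a precision of 33.2% means that about two of every three visits flagged by XGBoost were in fact attended.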
Mirugwe et al. studied this question.