A Random Forest model using 64 clinical features achieved an AUROC of approximately 0.75 for 6-month MACE prediction in patient-level analysis among hemodialysis patients.
Cohort (n=1,645)
Yes
Can machine learning models accurately predict incident major adverse cardiovascular events within 3, 6, and 12 months in patients undergoing maintenance hemodialysis?
A Random Forest machine learning model using routinely collected clinical variables can effectively predict 6- to 12-month major adverse cardiovascular events in hemodialysis patients.
Effect estimate: AUROC ≈0.75
Background: Cardiovascular disease (CVD) remains the leading cause of morbidity and mortality among patients undergoing maintenance hemodialysis (HD). Major adverse cardiovascular events (MACE) occur at disproportionately high rates in this population. Traditional statistical models have limited predictive ability in HD patients because they fail to capture the complex, nonlinear, and time-dependent interactions among clinical and dialysis-related factors. Machine learning (ML) algorithms may overcome these limitations and enable individualized risk prediction.
Methods: We retrospectively analyzed 1,645 HD patients (28,788 sessions) from the Taiwan Society of Nephrology Kidney Dialysis Therapy Database between 2014 and 2023. The primary outcome was incident MACE within 3, 6, and 12 months after each dialysis session. Four ML models, namely Extreme Gradient Boosting (XGBoost), Random Forest (RF), logistic regression, and k-nearest neighbors (KNN), were trained using 64 clinical features and progressively simplified to 20 and 15 variables. To address the longitudinal data structure, model performance was evaluated using both session-level random splitting and patient-level splitting, with the latter reflecting generalizability to unseen patients. Discrimination was assessed using the area under the receiver operating characteristic curve (AUROC), and model interpretability was examined using Shapley Additive exPlanations.
Results: Tree-based models (RF and XGBoost) consistently outperformed logistic regression and KNN across all time horizons. In patient-level analysis, the RF model using 64 features achieved robust performance for 6-month MACE prediction (AUROC ≈0.75), supporting generalizability across individuals. In contrast, session-level analysis demonstrated substantially higher discrimination (e.g., AUROC up to 0.97 at 12 months), reflecting enhanced within-patient risk stratification incorporating time-varying clinical signals. Key predictors included age, cardiothoracic ratio, glucose, albumin, ferritin, transferrin saturation, and alkaline phosphatase. Predictive performance declined at shorter prediction horizons, likely due to event sparsity and class imbalance.
Conclusions: An RF model using routinely collected clinical variables enables both generalizable patient-level cardiovascular risk prediction and refined session-level risk stratification in HD patients. While session-level analysis provides high-resolution, real-time discrimination, patient-level performance offers a more conservative and clinically relevant estimate of model generalizability. These findings support the potential role of ML in dialysis cardiovascular risk assessment, although further prospective validation is required before clinical implementation.
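The difference between session-level random splitting and patient-level splitting described in the Methods can be illustrated with a small sketch. This is not the authors' pipeline: the toy records, the `patient_id` field, and all function names below are hypothetical, and only the Python standard library is used. It shows why a session-level split lets the same patient appear in both training and test sets, whereas a patient-level split does not, and it includes a simple pairwise AUROC helper of the kind used to assess discrimination.

```python
import random


def session_level_split(sessions, test_frac=0.2, seed=0):
    """Randomly split individual sessions; one patient can land in both sets."""
    rng = random.Random(seed)
    shuffled = sessions[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]


def patient_level_split(sessions, test_frac=0.2, seed=0):
    """Split by patient ID so every session of a patient stays on one side."""
    rng = random.Random(seed)
    patient_ids = sorted({s["patient_id"] for s in sessions})
    rng.shuffle(patient_ids)
    n_test = max(1, int(len(patient_ids) * test_frac))
    test_ids = set(patient_ids[:n_test])
    train = [s for s in sessions if s["patient_id"] not in test_ids]
    test = [s for s in sessions if s["patient_id"] in test_ids]
    return train, test


def shared_patients(train, test):
    """Patient IDs that appear in both the training and the test set."""
    return {s["patient_id"] for s in train} & {s["patient_id"] for s in test}


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, so this is
    for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 50 patients with 10 dialysis sessions each.
sessions = [{"patient_id": p, "session": k} for p in range(50) for k in range(10)]

train_s, test_s = session_level_split(sessions)
train_p, test_p = patient_level_split(sessions)

print("patients shared, session-level split:", len(shared_patients(train_s, test_s)))
print("patients shared, patient-level split:", len(shared_patients(train_p, test_p)))
```

Because repeated sessions from one patient tend to resemble each other, overlap of this kind is one plausible reason the session-level AUROC reported above is much higher than the patient-level estimate. In practice a grouped splitter such as scikit-learn's `GroupShuffleSplit` serves the same purpose as `patient_level_split`.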
Wang et al. conducted a cohort study of patients on maintenance hemodialysis (n=1,645). Machine learning models (Random Forest and XGBoost) were compared with logistic regression and k-nearest neighbors for predicting incident MACE within 3, 6, and 12 months after each dialysis session (AUROC ≈0.75).
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: