What is the clinical evidence from this study?

Study design: Retrospective cohort. Population: Patients on maintenance hemodialysis (n=1,645). Intervention: Tree-based machine learning models (Random Forest and XGBoost) vs. logistic regression and k-nearest neighbors. Primary outcome: Incident MACE within 3, 6, and 12 months after each dialysis session. Main result: patient-level AUROC ≈0.75 for 6-month MACE prediction with Random Forest.

What question did this study set out to answer?

This research aims to develop and validate machine learning models for predicting major adverse cardiovascular events in hemodialysis patients.

June 19, 2026. Open Access.

Machine-learning approaches for major adverse cardiovascular event prediction in patients on hemodialysis

Key result

A Random Forest model using 64 clinical features achieved an AUROC of approximately 0.75 for 6-month MACE prediction in patient-level analysis among hemodialysis patients.

Key points

Retrospective analysis of 1645 hemodialysis patients from 2014 to 2023.
Training of four machine learning models: XGBoost, Random Forest, logistic regression, and KNN using 64 clinical features.
Model performance was evaluated through session-level and patient-level discrimination using AUROC.
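In patient-level analysis, the Random Forest model with 64 features achieved an AUROC of ≈0.75 for 6-month MACE prediction, supporting generalizability to unseen patients.
Under session-level random splitting, AUROC reached up to 0.97 at 12 months, reflecting within-patient risk stratification rather than performance on new patients.
Key predictors of MACE included age, cardiothoracic ratio, glucose, albumin, ferritin, transferrin saturation, and alkaline phosphatase.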
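
The difference between the two evaluation schemes named above can be sketched in a few lines. This is a minimal illustration, not the study's code: the toy records, the `patient_id` field, the 20% test fraction, and the helper names are all assumptions made for the example. It shows that a random split over sessions lets the same patient appear in both training and test data, while a split over patients does not.

```python
import random


def session_level_split(sessions, test_fraction=0.2, seed=0):
    """Randomly split individual sessions; one patient can land on both sides."""
    rng = random.Random(seed)
    shuffled = sessions[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def patient_level_split(sessions, test_fraction=0.2, seed=0):
    """Split by patient so that all sessions of a patient stay on one side."""
    rng = random.Random(seed)
    patient_ids = sorted({s["patient_id"] for s in sessions})
    rng.shuffle(patient_ids)
    n_test = max(1, int(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [s for s in sessions if s["patient_id"] not in test_ids]
    test = [s for s in sessions if s["patient_id"] in test_ids]
    return train, test


def shared_patients(train, test):
    """Patients whose sessions appear in both the training and the test set."""
    return {s["patient_id"] for s in train} & {s["patient_id"] for s in test}


# Toy data (hypothetical): 10 patients with 12 sessions each.
sessions = [{"patient_id": p, "session": k} for p in range(10) for k in range(12)]

train_s, test_s = session_level_split(sessions)
train_p, test_p = patient_level_split(sessions)
print("patients shared, session-level split:", len(shared_patients(train_s, test_s)))
print("patients shared, patient-level split:", len(shared_patients(train_p, test_p)))
```

Because a model evaluated under the session-level split has already seen other sessions from the test patients, its AUROC is expected to be higher than under the patient-level split, which is consistent with the gap reported here (up to 0.97 vs. ≈0.75).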

Study design

Type

Cohort (n=1,645)

Multicenter

Yes

Structured PICO

Can machine learning models accurately predict incident major adverse cardiovascular events within 3, 6, and 12 months in patients undergoing maintenance hemodialysis?

Population

1,645 hemodialysis patients (28,788 sessions) retrospectively analyzed from the Taiwan Society of Nephrology Kidney Dialysis Therapy Database between 2014 and 2023 to predict incident MACE.

Exposure

Machine learning models (Extreme Gradient Boosting, Random Forest, logistic regression, and k-nearest neighbors) using 64, 20, or 15 clinical features.

Outcome

Incident major adverse cardiovascular events (MACE) within 3, 6, and 12 months after each dialysis session (composite outcome).
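
A session-anchored outcome of this kind can be turned into binary labels as in the sketch below. It is an illustration under stated assumptions, not the authors' pipeline: the day counts per horizon (90, 180, 365), the function name `label_session`, the toy dates, and the choice to label sessions on or after the event date as 0 are all hypothetical.

```python
from datetime import date

# Assumed day counts for the 3-, 6-, and 12-month horizons (not from the study).
HORIZON_DAYS = {"3m": 90, "6m": 180, "12m": 365}


def label_session(session_date, mace_date, horizon_days):
    """Return 1 if a MACE occurs after the session and within the horizon, else 0."""
    if mace_date is None:
        return 0  # patient never had a recorded event
    delta = (mace_date - session_date).days
    return int(0 < delta <= horizon_days)


# Toy timeline (hypothetical): four sessions, then a MACE on 2020-11-15.
session_dates = [date(2020, 1, 1), date(2020, 4, 1), date(2020, 7, 1), date(2020, 10, 1)]
mace_date = date(2020, 11, 15)

labels = {
    name: [label_session(d, mace_date, days) for d in session_dates]
    for name, days in HORIZON_DAYS.items()
}
for name, row in labels.items():
    print(name, row, "positives:", sum(row))
```

In this toy timeline the 3-month horizon marks one of four sessions as positive, the 6-month horizon two, and the 12-month horizon all four. Shorter horizons therefore yield fewer positive labels from the same event.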

A Random Forest machine learning model using routinely collected clinical variables can effectively predict 6- to 12-month major adverse cardiovascular events in hemodialysis patients.

Numeric result

Effect estimate: AUROC ≈0.75 (Random Forest, 64 features, 6-month MACE, patient-level analysis)
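
For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen patient who had an event receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The sketch below computes it directly from that definition. The function name `auroc` and the toy labels and scores are illustrative assumptions, not study data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example (hypothetical): two events, two non-events.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

In the toy example three of the four event/non-event pairs are ordered correctly, so the result is 0.75. An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.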

Limitations

Further prospective validation is required before clinical implementation.

Abstract

Background: Cardiovascular disease (CVD) remains the leading cause of morbidity and mortality among patients undergoing maintenance hemodialysis (HD). Major adverse cardiovascular events (MACE) occur at disproportionately high rates in this population. Traditional statistical models have limited predictive ability in HD patients because they fail to capture the complex, nonlinear, and time-dependent interactions among clinical and dialysis-related factors. Machine learning (ML) algorithms may overcome these limitations and enable individualized risk prediction.

Methods: We retrospectively analyzed 1,645 HD patients (28,788 sessions) from the Taiwan Society of Nephrology Kidney Dialysis Therapy Database between 2014 and 2023. The primary outcome was incident MACE within 3, 6, and 12 months after each dialysis session. Four ML models (Extreme Gradient Boosting [XGBoost], Random Forest [RF], logistic regression, and k-nearest neighbors [KNN]) were trained using 64 clinical features and progressively simplified to 20 and 15 variables. To address the longitudinal data structure, model performance was evaluated using both session-level random splitting and patient-level splitting, with the latter reflecting generalizability to unseen patients. Discrimination was assessed using the area under the receiver operating characteristic curve (AUROC), and model interpretability was examined using Shapley Additive exPlanations.

Results: Tree-based models (RF and XGBoost) consistently outperformed logistic regression and KNN across all time horizons. In patient-level analysis, the RF model using 64 features achieved robust performance for 6-month MACE prediction (AUROC ≈0.75), supporting generalizability across individuals. In contrast, session-level analysis demonstrated substantially higher discrimination (e.g., AUROC up to 0.97 at 12 months), reflecting enhanced within-patient risk stratification incorporating time-varying clinical signals. Key predictors included age, cardiothoracic ratio, glucose, albumin, ferritin, transferrin saturation, and alkaline phosphatase. Predictive performance declined at shorter prediction horizons, likely due to event sparsity and class imbalance.

Conclusions: An RF model using routinely collected clinical variables enables both generalizable patient-level cardiovascular risk prediction and refined session-level risk stratification in HD patients. While session-level analysis provides high-resolution, real-time discrimination, patient-level performance offers a more conservative and clinically relevant estimate of model generalizability. These findings support the potential role of ML in dialysis cardiovascular risk assessment, although further prospective validation is required before clinical implementation.
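
The interpretability method mentioned in the abstract builds on Shapley values, which split a prediction among features by averaging each feature's marginal contribution over all orderings in which features could be added. The sketch below computes exact Shapley values for a tiny toy risk function. Everything in it is a hypothetical illustration: the three feature names are borrowed from the predictor list only as labels, the risk increments are invented, and the study's models and SHAP tooling are not reproduced here.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_risk(present):
    """Hypothetical risk score given which features are 'switched on'.

    The increments are invented for illustration and are not study estimates.
    """
    risk = 0.0
    if "age" in present:
        risk += 0.10
    if "albumin" in present:
        risk += 0.05
    if "age" in present and "glucose" in present:
        risk += 0.04  # interaction term shared between age and glucose
    return risk


features = ["age", "albumin", "glucose"]
phi = shapley_values(toy_risk, features)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the interaction term of 0.04 is divided equally between age and glucose, giving attributions of 0.12 for age, 0.05 for albumin, and 0.02 for glucose. The attributions sum to the full-model score of 0.19. Exact enumeration grows factorially with the number of features, so it is shown here only for intuition; with 64 features an approximation would be needed.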
