What question did this study set out to answer?

The aim is to review and assess the effectiveness of machine learning models in predicting stroke risk for patients with atrial fibrillation.

May 24, 2026

Advancing stroke prevention in atrial fibrillation: a systematic review of machine learning–based risk prediction models

Key Points

Aim: to assess how effectively machine learning models predict stroke risk in patients with atrial fibrillation.
Method: a systematic review of published studies on machine learning-based risk prediction models.

Structured PICO

Do machine learning models derived from EHR data improve the prediction of ischemic stroke compared to the CHA2DS2-VASc score in adults with atrial fibrillation?

Population

809,523 adults with atrial fibrillation from eight studies across seven countries

Intervention

Machine learning (ML) models derived from electronic health record (EHR) data

Comparator

CHA2DS2-VASc score

Outcome

Predictive performance for ischemic stroke (measured by AUROC)

While machine learning models show potential to improve ischemic stroke prediction in atrial fibrillation compared to CHA2DS2-VASc, current evidence is limited by pervasive methodological flaws and high risk of bias, precluding clinical adoption.

Abstract

BACKGROUND Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia and confers a four- to fivefold increase in ischemic stroke risk, accounting for approximately 15 to 20% of all stroke events globally. Despite this burden, the predominant risk stratification tool, the CHA2DS2-VASc score, achieves only modest discrimination, constrained by its static, additive architecture that cannot capture the nonlinear, high-dimensional interactions inherent in real-world electronic health record (EHR) data. This limitation creates a dual clinical hazard: under-anticoagulation in high-risk patients and unnecessary bleeding exposure in those whose risk is overestimated. This study aimed to systematically evaluate the predictive performance, methodological rigor, and clinical readiness of machine learning (ML) models derived from EHR data for the prediction of ischemic stroke in patients with AF.

METHODS A systematic search of PubMed, Embase, Scopus, and Web of Science was conducted from inception through September 2025, following PRISMA 2020 guidelines. Studies were eligible if they developed or validated ML models for ischemic stroke prediction using EHR data in adults with AF and reported at least one quantitative performance metric. Methodological quality was assessed using the PROBAST and TRIPOD-AI frameworks.

RESULTS Eight studies (2017 to 2024) encompassing 809,523 patients across seven countries were included. Supervised ensemble methods consistently outperformed CHA2DS2-VASc, with AUROCs ranging from 0.66 to 0.91 versus 0.54 to 0.68 for the traditional score. However, performance varied substantially: several models achieved only marginal gains (AUROC 0.63 to 0.69), and the AUROC range reflects pronounced heterogeneity rather than uniform superiority. Critical barriers persist: only one study performed external validation; fewer than half applied explainable AI techniques; class imbalance was rarely addressed; and 88% of studies received a high risk of bias rating in the analysis domain under PROBAST, a finding that substantially limits confidence in the reported performance estimates.

CONCLUSION In light of the pervasive methodological limitations identified, including high analytic risk of bias, near absence of external validation, and limited model interpretability, claims of ML superiority over CHA2DS2-VASc must be interpreted with caution. While ML models demonstrate potential discriminative improvements, current evidence is insufficient to support clinical adoption. Translating algorithmic promise into bedside impact requires dynamic longitudinal modeling, rigorous multisite external validation, transparent risk attribution, and prospective evaluation within real-world EHR workflows.
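
To make the two quantities compared in the abstract concrete, the following is a minimal Python sketch of how an additive CHA2DS2-VASc score is computed and how an AUROC summarizes its discrimination. It is not code from the review or from any included study. The `Patient` fields, the helper names, and the four-patient cohort are hypothetical and invented for illustration; the point values follow the standard published definition of the score.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record; field names are illustrative, not from the review."""
    age: int
    female: bool
    heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke_tia: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(p: Patient) -> int:
    """Sum the fixed point values of the CHA2DS2-VASc score (range 0 to 9)."""
    score = 0
    score += 1 if p.heart_failure else 0       # C: congestive heart failure
    score += 1 if p.hypertension else 0        # H: hypertension
    score += 2 if p.age >= 75 else (1 if p.age >= 65 else 0)  # A2 / A
    score += 1 if p.diabetes else 0            # D: diabetes mellitus
    score += 2 if p.prior_stroke_tia else 0    # S2: prior stroke/TIA/thromboembolism
    score += 1 if p.vascular_disease else 0    # V: vascular disease
    score += 1 if p.female else 0              # Sc: sex category (female)
    return score


def auroc(scores, labels):
    """Probability that a random case outranks a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy cohort (invented): each entry is (patient, had_ischemic_stroke).
cohort = [
    (Patient(age=80, female=True, hypertension=True, prior_stroke_tia=True), 1),
    (Patient(age=70, female=False, diabetes=True), 1),
    (Patient(age=60, female=False), 0),
    (Patient(age=68, female=True, hypertension=True), 0),
]
scores = [cha2ds2_vasc(p) for p, _ in cohort]
labels = [y for _, y in cohort]
print(scores, auroc(scores, labels))
```

Because every risk factor contributes a fixed number of points regardless of the others, the score cannot represent interactions between factors, which is the limitation the abstract attributes to its "static, additive architecture". An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.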

Cite This Study

Islam et al. studied this question.

synapsesocial.com/papers/6a129ac348a0ea166567408d https://doi.org/10.1016/j.ijmedinf.2026.106504
