The XGBoost machine learning model incorporating clinical variables, cardiac biomarkers, and echocardiographic parameters achieved an AUC of 0.906 for predicting stroke in NVAF patients with CHA2DS2-VA scores ≤ 1.
Case-Control (n=246)
No
Do machine learning models incorporating cardiac biomarkers and echocardiographic parameters improve stroke prediction in NVAF patients with CHA2DS2-VA scores ≤1?
Machine learning models incorporating cardiac biomarkers (NT-proBNP) and echocardiographic parameters (E/e' ratio, LVEF) accurately predict stroke risk in NVAF patients with low-to-moderate clinical risk scores.
Effect estimate: AUC 0.906 (95% CI 0.826-0.985)
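The effect estimate above is an AUC with a 95% confidence interval. The source does not state how that interval was computed, so the following is only a minimal sketch of one common approach: the AUC as a Mann-Whitney statistic with a percentile bootstrap interval. The function names, group sizes, and scores are illustrative assumptions and are not study data.

```python
import random


def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the probability that a random case outscores a random control.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI; cases and controls are resampled separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        pos = rng.choices(scores_pos, k=len(scores_pos))
        neg = rng.choices(scores_neg, k=len(scores_neg))
        stats.append(auc_mann_whitney(pos, neg))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic predicted risks (assumption: made up for illustration only).
rng = random.Random(42)
case_scores = [rng.gauss(0.7, 0.15) for _ in range(16)]
control_scores = [rng.gauss(0.4, 0.15) for _ in range(33)]

auc = auc_mann_whitney(case_scores, control_scores)
ci_low, ci_high = bootstrap_auc_ci(case_scores, control_scores)
print(f"AUC {auc:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

With a small hold-out set the interval is wide, which is consistent with the test-set interval reported above being much wider than a training-set interval would be.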
Background: Non-valvular atrial fibrillation (NVAF) patients with CHA₂DS₂-VA ≤1 face uncertainty in stroke risk assessment, particularly in Asian populations. Machine-learning (ML) models offer improved accuracy for individualized risk prediction.
Methods: This single-center, retrospective study at Zhongshan Hospital, Xiamen University (January 2022–January 2025) included 82 NVAF-related stroke cases and 164 matched non-stroke controls with CHA₂DS₂-VA ≤1. Data encompassed demographics, comorbidities, laboratory markers, and echocardiographic parameters. Following a stratified train-test split (80%:20%), feature selection used univariable logistic regression and least absolute shrinkage and selection operator (LASSO) regression, retaining the intersection of variables selected by both methods. Three nested predictor sets were defined: Model A (clinical and routine laboratory variables), Model B (Model A plus cardiac biomarkers), and Model C (Model B plus echocardiographic parameters). Three ML algorithms, logistic regression (LR), random forest (RF), and XGBoost (XGB), underwent nested cross-validation for hyperparameter tuning. Performance was evaluated by area under the receiver operating characteristic curve (AUC), sensitivity, specificity, precision, and F1 score. Shapley additive explanations (SHAP) were applied to the best-performing ML algorithm in the test set to evaluate the contributions of individual features.
Results: Stroke cases were older and had a higher E/e′ ratio and higher N-terminal pro-B-type natriuretic peptide (NT-proBNP), C-reactive protein (CRP), and white blood cell count (all P ≤ 0.01). Heart failure, hypertension, and age 65–74 years were more prevalent in stroke cases (all P ≤ 0.05). Feature selection yielded seven predictors: age, CRP, E/e′ ratio, left ventricular ejection fraction (LVEF), NT-proBNP, triglycerides, and white blood cell count. In the training set, XGB employing Model C achieved an AUC of 0.905 (95% CI: 0.877–0.933). In the test set, XGB employing Model C yielded an AUC of 0.906 (95% CI: 0.826–0.985). SHAP analysis identified NT-proBNP as the most influential feature, with elevated NT-proBNP and a higher E/e′ ratio associated with increased predicted risk and higher LVEF linked to decreased risk.
Conclusions: ML models incorporating cardiac biomarkers and echocardiographic parameters improve stroke risk stratification in low- to moderate-risk NVAF patients, supporting personalized anticoagulation strategies.
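The Methods describe a stratified 80%:20% split, feature selection by intersecting two selection methods, and three nested predictor sets. These steps can be sketched in plain Python as follows. Everything here is an illustrative assumption rather than the authors' code: the helper names, the rounding rule for the split, the two placeholder input lists, and the assignment of the four non-cardiac predictors to Model A. Only the cohort sizes (82 cases, 164 controls) and the seven final predictors come from the abstract.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) that preserve the case:control ratio."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Rounding per class is one possible choice; the study's exact
        # train/test counts are not given in the abstract.
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def intersect_selected(univariable, lasso):
    """Keep only the variables selected by both methods, in input order."""
    lasso_set = set(lasso)
    return [f for f in univariable if f in lasso_set]


# Cohort from the abstract: 82 stroke cases (1) and 164 controls (0).
labels = [1] * 82 + [0] * 164
train_idx, test_idx = stratified_split(labels)

# The seven predictors reported in the abstract (snake_case names are mine).
final_predictors = ["age", "crp", "e_over_e_prime", "lvef",
                    "nt_probnp", "triglycerides", "wbc"]

# Placeholder inputs: each list adds one made-up extra variable so the
# intersection has something to drop. These extras are hypothetical.
univariable_hits = final_predictors + ["extra_univariable_only"]
lasso_hits = ["extra_lasso_only"] + final_predictors
selected = intersect_selected(univariable_hits, lasso_hits)

# Nested predictor sets (grouping of Model A variables is an assumption).
MODEL_A = ["age", "crp", "triglycerides", "wbc"]
MODEL_B = MODEL_A + ["nt_probnp"]                # plus cardiac biomarker
MODEL_C = MODEL_B + ["e_over_e_prime", "lvef"]   # plus echocardiography

print(len(train_idx), len(test_idx), selected)
```

Splitting within each class keeps the 1:2 case-to-control ratio in both partitions, and building each model as a superset of the previous one makes any AUC gain attributable to the added variable group.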
Gao et al. conducted a case-control study of patients with non-valvular atrial fibrillation (NVAF) and CHA2DS2-VA scores ≤ 1 (n=246). An XGBoost machine learning model (Model C) was evaluated on the area under the receiver operating characteristic curve (AUC) for stroke prediction in the test set (AUC 0.906, 95% CI 0.826-0.985).