The k-nearest neighbors machine-learning model achieved an AUC of 0.927 for identifying comorbid atrial fibrillation in patients with diabetic kidney disease using routinely collected clinical data.
Cohort (n=787)
Yes
Does an interpretable machine-learning model accurately identify comorbid atrial fibrillation in patients with diabetic kidney disease?
An interpretable machine-learning model using routine clinical and echocardiographic variables showed high discrimination in internal validation (test-set AUC 0.927) for identifying comorbid atrial fibrillation in patients with diabetic kidney disease.
Background Patients with diabetic kidney disease (DKD) are at increased risk of atrial fibrillation (AF), yet tools to support identification of comorbid AF in this high-burden population remain limited. We aimed to develop and internally validate an interpretable machine-learning (ML) model for identifying concomitant AF in patients with DKD using routinely collected clinical data.
Methods In this retrospective two-center cohort study (January 2021 to October 2025), 787 unique records of patients with DKD were randomly divided into training (70%) and test (30%) sets. AF status was defined as documented atrial fibrillation coexisting with DKD and was ascertained using electrocardiograms, Holter monitoring when available, and ICD-10 diagnostic codes with physician adjudication. Candidate predictors were routine clinical, laboratory, and echocardiographic variables. Least absolute shrinkage and selection operator (LASSO) regression was used for feature selection in the training set. Seven supervised models were trained; performance was assessed by area under the receiver-operating characteristic curve (AUC), calibration, and decision-curve analysis. SHapley Additive exPlanations (SHAP) were used to quantify feature contributions.
Results LASSO retained 14 features, including 24-hour urine total protein (24UTP), serum creatinine (SCr), age, and atrial dimensions. In the test set, the k-nearest neighbors (KNN) model achieved an AUC of 0.927, with an accuracy of 0.886, sensitivity of 0.920, and specificity of 0.856. Calibration was satisfactory, and decision-curve analysis showed net benefit across commonly used thresholds. Five-fold cross-validation yielded a mean AUC of 0.90 ± 0.02. SHAP analysis identified proteinuria burden, renal dysfunction, age, and atrial size as major contributors to model output. The final model was translated into a preliminary web-based calculator based on routinely available inputs.
Conclusions An interpretable ML model incorporating routinely collected clinical and echocardiographic variables showed stable internal performance for identifying comorbid atrial fibrillation in patients with DKD. Because the model is intended to identify concomitant AF status rather than predict incident AF and has undergone internal validation only, further external and prospective validation is required before broader clinical application.
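For readers unfamiliar with the metrics reported in the Results, the following is a minimal, standard-library-only sketch of how a KNN score, AUC, accuracy, sensitivity, and specificity relate to one another. It is not the authors' code. The two features, the toy data points, the value `k=3`, the 0.5 threshold, and the function names `knn_score`, `auc`, and `threshold_metrics` are all illustrative assumptions; the study itself used 14 LASSO-selected features and does not report its KNN settings in the abstract.

```python
import math

def knn_score(train_X, train_y, x, k=3):
    """Score = fraction of the k nearest training points (Euclidean) labeled AF (1)."""
    neighbors = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    return sum(label for _, label in neighbors) / k

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed score threshold."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, y_true))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, y_true))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, y_true))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, y_true))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Made-up, already-scaled toy data (NOT study data): two hypothetical features
# per patient, label 1 = AF present, 0 = AF absent.
train_X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(0.1, 0.1), (1.0, 0.9), (0.6, 0.6), (0.3, 0.2), (0.8, 0.9)]
test_y = [0, 1, 1, 0, 0]

scores = [knn_score(train_X, train_y, x, k=3) for x in test_X]
metrics = threshold_metrics(test_y, scores, threshold=0.5)

print("scores:", scores)
print("AUC:", auc(test_y, scores))
print("metrics:", metrics)
```

AUC depends only on how the scores rank patients, whereas accuracy, sensitivity, and specificity depend on the chosen threshold. This is why the abstract reports discrimination (AUC) alongside threshold-based metrics rather than in place of them.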
Li et al. (Tue,) conducted a retrospective two-center cohort study of 787 patients with diabetic kidney disease. A k-nearest neighbors (KNN) machine-learning model for identifying comorbid atrial fibrillation from routinely collected clinical data was evaluated in the test set, where it achieved an area under the receiver-operating characteristic curve (AUC) of 0.927.