An EHR-based prediction model predicted 1-year risk of incident T2DM with an AUC of 0.883 (95% CI 0.880-0.886) in the validation cohort.
Cohort (n=3,365,464)
An EHR-based machine-learning model demonstrated excellent discrimination and calibration for predicting 1-, 3-, and 10-year risk of incident T2DM in a large real-world cohort.
Effect estimate: AUC 0.883 (95% CI 0.880-0.886)
Introduction and Objective: Over 60% of U.S. adults have risk factors for T2DM, complicating scale-up and sustainability of evidence-based prevention efforts. We developed an EHR-based T2DM prediction model to facilitate real-world implementation.
Methods: We conducted a retrospective cohort study among adults aged 18-70 years receiving care at Kaiser Permanente Northern California from 2012 to 2024, followed until T2DM onset, death, disenrollment, or Dec 31, 2024. The cohort (N=3,365,464) was randomly split 70:30 for training and validation. We applied a hazard-based Super Learning approach to predict 1-, 3-, and 10-year risk. Incident T2DM was defined using diagnosis codes, glycemic test values, or T2DM medication fills; adults prescribed only metformin, SGLT2 inhibitors, or GLP-1 receptor agonists without a diagnosis code, lab value, or another T2DM medication were not classified as having T2DM. Predictors included demographics, clinical measures, lifestyle factors, comorbidities, prescriptions, utilization, and novel predictors (MASLD and neighborhood-level measures of SES, walkability, and food environment).
Results: Median age was 39 years (IQR: 28-53), and 55% were female. During a median follow-up of 5.4 years, T2DM incidence was 10.7/1,000 person-years. Within 1-year follow-up, the predictive model achieved an AUC of 0.886 (95% CI: 0.883-0.888) in training and 0.883 (95% CI: 0.880-0.886) in validation, with near-ideal calibration (mean predicted risk 1.03% vs observed 1.01%; slope 1.26). At the optimal cut-point (1.2% risk), which identified the top two deciles of risk, sensitivity was 80%, specificity was 81%, and the number needed to evaluate was 25. Results were consistent for 3- and 10-year follow-up.
Conclusion: This EHR-based prediction model, developed and validated in over 3 million adults, demonstrated excellent discrimination and calibration. It can support clinicians in identifying patients for T2DM prevention programs and pharmacologic interventions, and can enable efficient recruitment for intervention studies. Current work focuses on external validation and future integration into clinical workflows.
Disclosure: L.A. Rodriguez: None. M.M. Yassin: None. R. Neugebauer: None. T.R. Levin: Research Support; Current; Freenome, Inc. Advisory Panel; Current; Geneoscopy, Navatar. A. Gopalan: None. V. Saxena: None. J. An: Research Support; Current; Bayer AG, AstraZeneca. Research Support; Ended; Merck Current; Gilead Sciences, Inc.
Funding: National Institute of Diabetes and Digestive and Kidney Diseases (5K01DK138122 and 1P30DK092924).
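The operating-point figures reported in the Results (sensitivity 80%, specificity 81%, number needed to evaluate 25) can be cross-checked with a short calculation. The sketch below is illustrative only and is not the authors' code. It assumes that the number needed to evaluate equals 1/PPV (the abstract does not state its definition) and that the observed 1-year risk of 1.01% can stand in for prevalence. The function name `operating_point` is hypothetical.

```python
def operating_point(sensitivity, specificity, prevalence):
    """Derive screening quantities from sensitivity, specificity, and prevalence.

    Assumption (not stated in the abstract): number needed to evaluate (NNE)
    is taken as 1 / positive predictive value.
    """
    # Fraction of the whole population that is a flagged true case
    true_pos = sensitivity * prevalence
    # Fraction of the whole population that is flagged but does not develop T2DM
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    flagged = true_pos + false_pos
    ppv = true_pos / flagged
    return {"flagged_fraction": flagged, "ppv": ppv, "nne": 1.0 / ppv}


# Inputs taken from the abstract; 1.01% is the observed 1-year risk,
# used here as an assumed prevalence.
result = operating_point(sensitivity=0.80, specificity=0.81, prevalence=0.0101)

print(f"flagged fraction: {result['flagged_fraction']:.3f}")  # 0.196
print(f"PPV:              {result['ppv']:.3f}")               # 0.041
print(f"NNE:              {result['nne']:.1f}")               # 24.3
```

Under these assumptions, about 19.6% of the cohort is flagged, which agrees with the cut-point identifying the top two deciles, and the derived value of roughly 24 is close to the reported 25. The small gap may reflect rounding of the published sensitivity and specificity.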
Rodriguez et al. conducted a cohort study of type 2 diabetes mellitus (n=3,365,464). The EHR-based T2DM prediction model was evaluated on 1-year risk of incident T2DM (AUC 0.883, 95% CI 0.880-0.886).