Background
Early diagnosis of dementia is essential for enabling timely interventions that may slow disease progression, improve patient outcomes, and reduce healthcare costs. This study aims to develop machine learning models to predict dementia risk using longitudinal electronic health record (EHR) data.

Objective
This research aims to develop and evaluate machine learning models for dementia risk prediction using longitudinal EHR data from routine clinical care and to identify key clinical features associated with elevated dementia risk.

Methods
We conducted an incidence-based case-control study using EHR data from the UMass Memorial Health system (2017-2024) to develop a dementia risk prediction model.

Results
This study included 5622 dementia cases and 44,976 controls. The XGBoost model achieved the highest area under the receiver operating characteristic curve (AUC) of 0.802; top predictors included thyroid-stimulating hormone (TSH), vitamin B12, and HDL cholesterol. Model performance was consistent across sexes and remained robust in multiple sensitivity analyses.

Conclusions
Machine learning models that integrate comorbid conditions and longitudinal laboratory test patterns show potential for predicting dementia risk. These findings highlight the promise of routinely collected EHR data as a scalable, low-cost resource for identifying individuals at elevated risk for dementia.
Ye et al. (Thu,) studied this question.
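
The headline metric in the Results is the AUC. As a minimal sketch of what that number measures, and not the authors' pipeline, the AUC can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the toy labels and scores below are hypothetical and chosen only for illustration; the case and control counts are the ones reported in the abstract.

```python
def roc_auc(labels, scores):
    """Rank-based (Mann-Whitney) AUC for binary labels (1 = case, 0 = control).

    Illustrative helper only; this is not code from the study.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one control")

    # Sort by score, then give tied scores the average of their ranks.
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        cases_in_group = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * cases_in_group
        i = j

    # Mann-Whitney U statistic for the cases, normalised by the number of pairs.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example (hypothetical scores, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)

# Controls per case implied by the counts reported in the abstract.
controls_per_case = 44976 / 5622
```

Under this reading, an AUC of 0.802 means the model ranks a random case above a random control roughly 80% of the time. The reported counts also work out to exactly 8 controls per case, so a metric that depends on ranking rather than on a fixed threshold is a natural fit for this kind of imbalanced design.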