Random forest machine learning models utilizing EHR data demonstrated strong discrimination for predicting OSA (ROC-AUC 0.84) and moderate-severe OSA (ROC-AUC 0.82).
Observational (n=285,292)
Do machine learning models utilizing electronic health record data accurately predict obstructive sleep apnea in adults?
EHR-based machine learning models, particularly random forests, demonstrate strong potential for predicting obstructive sleep apnea using cardiometabolic risk features.
Effect estimate: ROC-AUC 0.84
Abstract

Introduction: Models that predict which patients are at risk of obstructive sleep apnea (OSA) may be incorporated into clinical screening tools. We developed preliminary machine learning (ML) models utilizing electronic health record (EHR) data to predict OSA and moderate–severe OSA.

Methods: We identified adults who underwent diagnostic sleep studies (2016-2025) with available AHI4% values in Kaiser Permanente Southern California. Baseline characteristics were assessed with the following candidate feature categories: demographics, comorbidities, vitals including anthropometric data (average over the prior year), and laboratory values (most recent within five years). Prediction models were developed for two classification tasks: (1) OSA vs No OSA, and (2) moderate–severe OSA (AHI ≥15) vs No+Mild OSA (AHI 0–14.9). Data were split into 80% training and 20% testing sets. Logistic regression, random forest, and XGBoost ML models were trained and evaluated on the held-out test set.

Results: 285,292 adults (57% male) were included in the modeling: mean age 50.5±16.7 years, BMI 33.2±8.1 kg/m², AHI 20.5±22.9. OSA prevalence was high: 27.1% normal (AHI <5), 29.5% mild, 20.0% moderate, 23.3% severe. Race/ethnicity: 38.5% Hispanic, 38.0% White, 9.9% Asian, 8.3% Black, 5.2% Other/Multiple.

For predicting OSA (AHI ≥5), ML models demonstrated strong discrimination. Random forests achieved the highest overall performance: ROC-AUC 0.84; sensitivity 0.96; specificity 0.43; PPV/NPV 0.82/0.81. Logistic regression showed high sensitivity but lower specificity: AUC 0.76; sensitivity 0.94; specificity 0.34. XGBoost (AUC 0.79) yielded the best rule-in profile, with the highest specificity (0.65) and PPV (0.86), and moderate sensitivity (0.78).

For predicting moderate–severe OSA (AHI ≥15), performance patterns were similar. Random forests again performed best: ROC-AUC 0.82; sensitivity 0.62; specificity 0.81; PPV/NPV 0.72/0.74. Logistic regression showed good sensitivity but lower specificity (AUC 0.74; sensitivity 0.57; specificity 0.76), while XGBoost (AUC 0.79) provided the highest sensitivity for rule-out classification (0.72) with moderate specificity (0.66). Across both models, OSA and moderate–severe OSA were most strongly predicted by age, male sex, BMI/weight, cardiometabolic comorbidities, and metabolic laboratory biomarkers (e.g., glucose, lipids, HbA1c).

Conclusion: EHR-based ML models demonstrated strong potential for predicting OSA. Key predictors reflected cardiometabolic risk features. Further refinement, model optimization, and real-world validation are needed.

Support: NIH NHLBI R01 HL161253-01A1
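The reported PPV and NPV depend on how common OSA is in the cohort as well as on sensitivity and specificity. The sketch below is a rough consistency check of the random forest figures, not the authors' analysis code. It assumes the test-set prevalence matches the whole-cohort severity distribution given in the abstract (the abstract does not say whether the 80/20 split was stratified), and the function name `predictive_values` is a hypothetical helper introduced only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalences taken from the severity distribution in the abstract
# (assumption: the held-out test set has the same distribution).
prev_osa = 0.295 + 0.200 + 0.233     # mild + moderate + severe (AHI >= 5)
prev_mod_severe = 0.200 + 0.233      # moderate + severe (AHI >= 15)

# Random forest operating points reported in the abstract.
ppv_osa, npv_osa = predictive_values(0.96, 0.43, prev_osa)
ppv_ms, npv_ms = predictive_values(0.62, 0.81, prev_mod_severe)

print(f"OSA (AHI>=5):          PPV={ppv_osa:.2f}  NPV={npv_osa:.2f}")
print(f"Moderate-severe OSA:   PPV={ppv_ms:.2f}  NPV={npv_ms:.2f}")
```

Under these assumptions the computed values land within about 0.01 of the reported 0.82/0.81 and 0.72/0.74, a gap that is plausibly due to rounding of the published inputs. The same calculation suggests that the PPV for any OSA would be lower in a general primary-care population, where prevalence is lower than in a cohort already referred for sleep studies.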
Hwang et al. conducted an observational study of obstructive sleep apnea (n=285,292). Machine learning models (random forest, logistic regression, XGBoost) were evaluated on prediction of OSA (AHI≥5) and moderate-severe OSA (AHI≥15), with a best ROC-AUC of 0.84.