What question did this study set out to answer?

This research aims to improve the prediction of obstructive sleep apnea (OSA) using data from various consumer sleep technologies.

May 10, 2026

0563 Machine Learning-Based Prediction of Sleep Apnea Using Objective Sleep Data from 138 Consumer Sleep Technologies

Key Points

This research aims to improve the prediction of obstructive sleep apnea (OSA) using data from various consumer sleep technologies.
The dataset comprised objective sleep data from 19,431 users of the SleepScore mobile app, contributed through Apple HealthKit from 138 third-party devices and apps.
The authors developed a cross-device harmonization model to standardize sleep metrics across devices.
Automated machine learning (PyCaret) was used to compare classifiers trained on 376 engineered features per user-device combination.
The best-performing model, a Linear Discriminant Analysis (LDA) classifier, achieved an AUC of 0.77 with 74% accuracy and 63% sensitivity.
Across the 12 most common devices, balanced accuracy ranged from 62% to 74% and Kappa coefficients from 0.23 to 0.34.
Key non-demographic predictors reflected variability in sleep patterns, suggesting that fluctuations in sleep stability may help identify OSA risk.
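
The accuracy, sensitivity, and Kappa figures quoted above are all functions of one confusion matrix. The abstract does not report that matrix, so the minimal sketch below uses hypothetical counts (per 1,000 test users) chosen only so that the resulting rates land near the reported ones. The function name `screening_metrics` and the implied class balance are assumptions of this example, not details from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute common binary screening metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)          # share of OSA cases flagged
    specificity = tn / (tn + fp)          # share of non-OSA users cleared
    precision = tp / (tp + fp)            # share of flags that are true OSA
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fn) / total) * ((tp + fp) / total)
    p_no = ((tn + fp) / total) * ((tn + fn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": f1,
        "kappa": kappa,
    }


# HYPOTHETICAL counts for illustration only; not taken from the study.
metrics = screening_metrics(tp=115, fp=188, fn=68, tn=629)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, accuracy is about 0.74 while Kappa is only about 0.32. The gap shows why a chance-corrected metric is informative when a minority of users report OSA.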

Abstract

Introduction: Obstructive sleep apnea (OSA) remains highly prevalent yet markedly underdiagnosed. Consumer sleep technologies (CSTs) offer scalable access to objective sleep data, but device-specific differences in sensing and proprietary algorithms limit their clinical utility. Most prior work uses single-device datasets. We developed a cross-device harmonization framework to align heterogeneous CST-derived sleep metrics and evaluated its capacity to support scalable, device-agnostic OSA screening.

Methods: Objective sleep data from users of the SleepScore mobile app were contributed through Apple HealthKit from 138 third-party devices and apps, including most wearables (e.g., Apple Watch, Garmin, Oura, Whoop) and mobile sleep apps. OSA status was established via self-report of clinical diagnosis (13% of users self-reported OSA). The dataset contained 19,431 users (4.3M nights). Users reporting active treatment or comorbidities were excluded. Heterogeneous device-derived sleep metrics were aligned onto a common scale using a cross-device harmonization model, which corrected systematic differences across device ecosystems and produced device-agnostic metrics while preserving physiologically relevant variance. For each user-device combination, 376 engineered features captured nightly distributions, weekday-weekend contrasts, and temporal trends. Automated Machine Learning (PyCaret) compared classifiers using 10-fold cross-validation with an 80/20 user-level train/test split to prevent target leakage. The models were optimized for Kappa and evaluated using AUROC and standard classification metrics.

Results: A Linear Discriminant Analysis (LDA) model performed best (AUC 0.77, accuracy 74%, sensitivity 63%, specificity 77%, precision 38%, F1-score 0.48, Kappa 0.32). Performance was relatively consistent across the 12 most common devices despite large differences in self-reported prevalence (12–29%), with balanced accuracy ranging from 62% to 74% and Kappa from 0.23 to 0.34. Key non-demographic predictors reflected variability in sleep patterns, suggesting that fluctuations in sleep stability may play a role in identifying OSA risk.

Conclusion: These results support the potential of consumer sleep technologies, when combined with cross-device standardization, to improve early OSA identification independent of device type. The findings highlight promising device-agnostic performance while also indicating opportunities for refinement, including prospective PSG validation and improved labeling strategies.

Support (if any): Sleep.ai
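
The user-level split described in the Methods keeps every row from a given user on one side of the train/test boundary, so a model cannot be scored on a person it has already seen through another device. The sketch below illustrates the idea with the standard library only. It is not the authors' PyCaret pipeline; the helper names (`split_users`, `user_folds`) and the toy rows are hypothetical.

```python
import random


def split_users(user_ids, test_fraction=0.2, seed=0):
    """Hold out a fraction of unique users (not rows) for testing."""
    unique = sorted(set(user_ids))
    random.Random(seed).shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    return set(unique[n_test:]), set(unique[:n_test])


def user_folds(train_users, k=10, seed=0):
    """Partition the training users into k disjoint cross-validation folds."""
    users = sorted(train_users)
    random.Random(seed).shuffle(users)
    return [set(users[i::k]) for i in range(k)]


# HYPOTHETICAL data: 50 users, each with one row per device they used.
rows = [
    {"user": f"u{i}", "device": device}
    for i in range(50)
    for device in ("watch", "ring")
]

train_users, test_users = split_users(r["user"] for r in rows)
folds = user_folds(train_users, k=10)

# Rows follow their user, so both device rows of a user stay together.
train_rows = [r for r in rows if r["user"] in train_users]
test_rows = [r for r in rows if r["user"] in test_users]

print(len(train_users), "train users,", len(test_users), "test users")
print(len(train_rows), "train rows,", len(test_rows), "test rows")
```

Splitting at the row level instead would let one user's watch data appear in training and the same user's ring data in testing, which tends to inflate held-out performance.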
