Abstract
Introduction Cortical arousals provide insight into sleep pathologies and sleep quality through their relationships to daytime impairment, respiratory effort-related arousals (RERAs), and hyperarousal patterns in patients with various sleep and psychiatric conditions. Scoring of arousals from in-lab polysomnography (PSG) by expert readers is the gold standard, but it is labor-intensive and subject to significant inter- and intra-rater variability. Here, we compare arousal detection from PSG with that of Waveband¹, an FDA-cleared, at-home, dry-electrode EEG device that captures cortical arousals and other sleep metrics.
Methods Waveband and PSG were recorded simultaneously overnight from 36 individuals (ages 25-65, 69% female). Arousals were scored from the PSG signals by 5 Registered PSG Technologists (RPSGTs) following AASM guidelines, solely to serve as ground truth for algorithm evaluation. Arousals on Waveband were detected by a machine-learning algorithm trained on over 15,000 nights. Performance was evaluated in two ways: by computing the intraclass correlation coefficient (ICC) between the model’s arousal index (ArI, arousals per hour of sleep) and the scorers’ ArI, and by computing Cohen’s kappa between the model’s labels and the scorers’ majority vote for 30-second epochs with or without arousals.
Results The mean ArI ICC between pairs of human scorers was 0.74 (SD=0.107, range: 0.58-0.88). The mean ICC between Waveband’s ArI and individual scorers was 0.89 (SD=0.039, range: 0.74-0.85). Comparing the Waveband ArI to the average ArI of all scorers resulted in an ICC of 0.89 (95% CI: 0.84-0.94). The bootstrapped 95% confidence interval for the difference between the average Waveband ArI ICC and the average scorer ArI ICC had a lower bound of 0.006, which is above zero, demonstrating superiority of Waveband over the average expert in scoring ArI. Waveband’s epoch-level kappa of 0.59 (95% CI: 0.55-0.62) exceeded that of most experts (0.44 to 0.61, each evaluated against the majority vote of the other raters).
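The ArI comparison described in Methods can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is an assumption here, and the function names `arousal_index` and `icc_2_1` are hypothetical. Each row of the input is one night and each column is one rater (a human scorer or the algorithm).

```python
def arousal_index(n_arousals, total_sleep_minutes):
    """Arousals per hour of sleep (ArI)."""
    return n_arousals / (total_sleep_minutes / 60.0)


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ASSUMPTION: the ICC form is not specified in the abstract; ICC(2,1)
    is used here purely for illustration.
    ratings: list of rows (one per night), each with one ArI per rater.
    """
    n = len(ratings)        # number of nights (targets)
    k = len(ratings[0])     # number of raters
    values = [x for row in ratings for x in row]
    grand = sum(values) / len(values)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Under these assumptions, a pairwise ICC between two raters is obtained by passing a two-column table, for example `icc_2_1([[a, b] for a, b in zip(ari_rater_1, ari_rater_2)])`, where the two lists are placeholder per-night ArI values.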
Conclusion An at-home sleep EEG device with an associated machine-learning algorithm can detect arousals with high epoch-level agreement and provides an estimate of the arousal index with greater reliability than the human experts in our study. These findings demonstrate the potential for scalable, objective quantification of sleep fragmentation in real-world settings, possibly enabling more efficient evaluation of therapies and future diagnostic applications.
¹ https://beacon.bio/products-services
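The epoch-level agreement reported above compares binary labels per 30-second epoch (arousal present or absent) against the scorers' majority vote. A minimal sketch of that comparison follows; the function names `majority_vote` and `cohens_kappa` are hypothetical, and the tie-handling rule (a tie counts as no arousal) is an assumption that only matters for an even number of scorers.

```python
def majority_vote(scorer_labels):
    """Per-epoch consensus from several scorers.

    scorer_labels: list of per-scorer lists of 0/1 epoch labels,
    all of the same length. An epoch is labeled 1 only when a strict
    majority of scorers marked an arousal (ASSUMPTION: ties become 0).
    """
    n_scorers = len(scorer_labels)
    return [
        1 if 2 * sum(votes) > n_scorers else 0
        for votes in zip(*scorer_labels)
    ]


def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of 0/1 labels."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    pa1 = sum(a) / n                                # rate of 1s in a
    pb1 = sum(b) / n                                # rate of 1s in b
    p_exp = pa1 * pb1 + (1 - pa1) * (1 - pb1)       # chance agreement
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)
```

To evaluate one human scorer in the same way, the consensus would be built from the remaining scorers only, so that a scorer is never compared against a vote that includes their own labels.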
Fürbass et al. studied this question.