Abstract

Introduction: Disturbed sleep is a hallmark of many neuropsychiatric conditions, yet the gold standard for assessment, in-clinic polysomnography (PSG) with staging by expert technologists, is costly, time-intensive, subject to inter-rater variability, and burdensome to patients. In this study, we evaluate the performance of Waveband, an FDA-cleared, at-home, dry-electrode EEG headband that captures sleep physiology and includes a newly updated automated sleep-staging algorithm implemented under an FDA-authorized Predetermined Change Control Plan (PCCP).

Methods: Overnight EEG headband and PSG recordings were acquired simultaneously from 38 subjects (ages 23–66; 50% female) in the OCTAVE-3 study (NCT05438017). Each 30-second epoch was scored by six registered PSG technologists following AASM guidelines, whereas the EEG headband data were scored using the updated sleep-staging algorithm. To compare human and algorithm performance, each individual expert and the updated EEG headband algorithm were independently evaluated against the weighted majority consensus of the remaining five experts. Epoch-level multi-stage and per-stage agreements were computed for each comparison.

Results: When each expert was evaluated against the consensus of the remaining raters, overall agreement ranged from 77.6 ± 7.3% to 88.7 ± 4.7% (mean ± SD across recordings), whereas the EEG headband achieved 87.8 ± 4.1% to 89.1 ± 3.7%. The EEG headband showed significantly greater overall agreement than two experts (p < 0.05, one-sided paired Wilcoxon signed-rank test) and was not significantly different from the remaining four experts (p > 0.05, two-sided paired Wilcoxon signed-rank test). Further analyses revealed comparable or improved agreement across all individual sleep stages. In particular, the EEG headband achieved higher agreement than all six experts in N1 and higher agreement than five of the six experts in REM, both of which are typically subject to greater inter-rater variability.

Conclusion: The Waveband at-home EEG system and its updated sleep-staging algorithm achieve performance comparable to or exceeding that of individual human experts scoring PSG. These findings support the feasibility of accurate, scalable sleep assessment outside the laboratory and demonstrate the PCCP as an effective mechanism for continuous AI/ML improvement in FDA-cleared medical devices.
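The leave-one-out comparison described in the Methods can be illustrated with a minimal Python sketch. It is not the study's implementation. The abstract does not specify the consensus weights, so an unweighted plurality vote with a fixed tie-break order stands in for the weighted majority consensus. The function names, the stage labels in `STAGES`, and the definition of per-stage agreement (the fraction of consensus epochs of a given stage that the candidate scorer also assigned to that stage) are all assumptions made for illustration.

```python
from collections import Counter

# Assumed label set; every scored epoch must carry one of these labels.
STAGES = ("W", "N1", "N2", "N3", "REM")


def majority_consensus(scorings):
    """Per-epoch plurality vote across several raters' hypnograms.

    Assumption: the vote is unweighted and ties go to the earliest stage
    in STAGES. The study's actual weighting scheme is not given here.
    """
    consensus = []
    for labels in zip(*scorings):
        counts = Counter(labels)
        top = max(counts.values())
        consensus.append(next(s for s in STAGES if counts.get(s, 0) == top))
    return consensus


def overall_agreement(candidate, reference):
    """Fraction of epochs on which candidate and reference share a stage."""
    matches = sum(c == r for c, r in zip(candidate, reference))
    return matches / len(reference)


def per_stage_agreement(candidate, reference):
    """Per stage: share of reference epochs of that stage matched by candidate.

    Stages that never occur in the reference are omitted from the result.
    """
    result = {}
    for stage in STAGES:
        idx = [i for i, r in enumerate(reference) if r == stage]
        if idx:
            result[stage] = sum(candidate[i] == stage for i in idx) / len(idx)
    return result


def leave_one_out(experts):
    """Score each expert against the consensus of the remaining experts."""
    scores = {}
    for name, labels in experts.items():
        others = [v for k, v in experts.items() if k != name]
        scores[name] = overall_agreement(labels, majority_consensus(others))
    return scores


def score_algorithm(algorithm, experts, held_out):
    """Score the algorithm against the consensus that excludes `held_out`."""
    others = [v for k, v in experts.items() if k != held_out]
    return overall_agreement(algorithm, majority_consensus(others))
```

Calling `score_algorithm` once per held-out expert gives one algorithm score for each expert-specific consensus, which would be consistent with the headband's agreement being reported as a range. Computing these per recording would then yield paired values that could be passed to a paired test such as `scipy.stats.wilcoxon`, whose `alternative` argument selects a one-sided or two-sided test.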
Manley et al. (Fri,) studied this question.