An automated AI-based ECG sleep staging algorithm demonstrated substantial agreement with manual scoring, achieving an overall percent agreement of 77.13% and a Cohen's kappa of 0.67.
Observational (n=30)
No
Does an automated AI-based ECG sleep staging algorithm agree closely with manual scoring in patients with suspected sleep disorders?
An automated AI-based ECG sleep staging algorithm showed substantial agreement with manual scoring, highlighting its potential as a scalable tool for sleep architecture assessment.
Effect estimate: Cohen's kappa 0.67 (95% CI 0.62-0.71)
Abstract
Introduction AI-based ECG-only sleep staging could be essential for diagnosing sleep disorders and expanding access to automated phenotyping. This would enable versatile analysis of polysomnography recordings, including applications in cardiometabolic and mental health. Cardiorespiratory coupling suggests that sleep stages can be inferred through autonomic changes. This study assesses the consistency between an automated AI-based ECG sleep staging algorithm and manual scoring by sleep technicians in a clinical population.
Methods We analyzed data from the first 30 participants enrolled at the VA Greater Los Angeles Healthcare System for evaluation of suspected sleep disorders. The ECG-based algorithm generated 30-second epoch classifications for Wake, N1, N2, N3, and REM using lead-II ECG features. Human scoring performed by clinical sleep technicians served as the reference standard. Agreement between automated and manual scoring was evaluated using overall percent agreement (OPA) and Cohen’s kappa. For each sleep stage, we additionally calculated positive percent agreement (PPA), negative percent agreement (NPA), and positive predictive value (PPV). Point estimates and 95% confidence intervals for all metrics (OPA, Cohen’s kappa, PPA, NPA, and PPV) were obtained with a bootstrap procedure using 10,000 resamples at the EDF-file level.
Results A total of 24,949 epochs were analyzed. The OPA between the ECG-based algorithm and human scorers was 77.13% (95% CI: 73.81%-79.95%), with a Cohen’s kappa of 0.67 (95% CI: 0.62-0.71), indicating substantial agreement. Stage-specific results for ECG-based staging showed high PPA for Wake (91.00%), N2 (83.49%), and REM (81.40%), moderate PPA for N3 (51.39%), and low PPA for N1 (27.13%). NPA values were high across stages, and PPV was strongest for Wake and REM.
Conclusion The automated AI-based ECG staging demonstrated robust agreement with traditional manual EEG scoring, particularly for accurately identifying Wake, N2, and REM sleep. While N3 detection requires further refinement, these results highlight the strong potential of ECG as a simple, scalable, and standalone tool for assessing sleep architecture in clinical settings.
Support: Medibio Limited.
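The agreement metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes two equal-length lists of epoch labels, with manual scoring as the reference, and treats PPA, NPA, and PPV as one-vs-rest quantities per stage. The function name `agreement_metrics` and the toy labels are hypothetical; this is not the study's code or data.

```python
from collections import Counter

STAGES = ["Wake", "N1", "N2", "N3", "REM"]

def agreement_metrics(manual, auto, stages=STAGES):
    """Epoch-by-epoch agreement of an automated scoring against a manual reference."""
    if len(manual) != len(auto) or not manual:
        raise ValueError("need two non-empty, equal-length label sequences")
    n = len(manual)
    pairs = Counter(zip(manual, auto))  # confusion counts keyed by (manual, auto)
    ref, test = Counter(manual), Counter(auto)

    # Overall percent agreement: fraction of epochs with identical labels.
    opa = sum(m == a for m, a in zip(manual, auto)) / n
    # Agreement expected by chance from the two marginal label distributions.
    p_e = sum(ref[s] * test[s] for s in set(ref) | set(test)) / (n * n)
    kappa = (opa - p_e) / (1 - p_e) if p_e < 1 else float("nan")

    nan = float("nan")
    per_stage = {}
    for s in stages:
        tp = pairs[(s, s)]      # both scorings say stage s
        fn = ref[s] - tp        # manual says s, algorithm says something else
        fp = test[s] - tp       # algorithm says s, manual says something else
        tn = n - tp - fn - fp   # neither says s
        per_stage[s] = {
            "PPA": tp / (tp + fn) if tp + fn else nan,
            "NPA": tn / (tn + fp) if tn + fp else nan,
            "PPV": tp / (tp + fp) if tp + fp else nan,
        }
    return {"OPA": opa, "kappa": kappa, "per_stage": per_stage}

# Toy example with six epochs (illustrative labels only).
manual = ["Wake", "Wake", "N2", "N2", "REM", "N1"]
auto = ["Wake", "N1", "N2", "N2", "REM", "N2"]
metrics = agreement_metrics(manual, auto)
```

With manual scoring as the reference, PPA corresponds to per-stage sensitivity and NPA to per-stage specificity, while PPV is the fraction of the algorithm's calls for a stage that the manual scoring confirms.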
Grassi et al. conducted an observational study in patients with suspected sleep disorders (n=30). An automated AI-based ECG sleep staging algorithm was compared with manual scoring by clinical sleep technicians, using overall percent agreement (OPA) and Cohen's kappa as agreement measures (Cohen's kappa 0.67, 95% CI 0.62-0.71). The algorithm demonstrated substantial agreement with manual scoring, achieving an OPA of 77.13% and a Cohen's kappa of 0.67.
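A confidence interval of the kind reported for Cohen's kappa can be sketched with a file-level (cluster) bootstrap, in which whole recordings rather than individual epochs are resampled with replacement. The sketch below rests on assumptions not stated in the source: epochs from the resampled recordings are pooled before computing the metric, and the interval is a simple percentile interval. The names `cohen_kappa` and `file_level_bootstrap_ci` and the toy recordings are hypothetical.

```python
import random

def cohen_kappa(manual, auto):
    """Cohen's kappa for two equal-length label lists."""
    n = len(manual)
    p_o = sum(m == a for m, a in zip(manual, auto)) / n
    labels = set(manual) | set(auto)
    p_e = sum(manual.count(s) * auto.count(s) for s in labels) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")

def file_level_bootstrap_ci(recordings, metric=cohen_kappa,
                            n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile CI from resampling whole recordings with replacement.

    recordings: list of (manual_labels, auto_labels) pairs, one per file.
    """
    rng = random.Random(seed)
    k = len(recordings)
    stats = []
    for _ in range(n_resamples):
        # Draw k recordings with replacement, keeping each file's epochs together.
        sample = [recordings[rng.randrange(k)] for _ in range(k)]
        manual = [label for m, _ in sample for label in m]
        auto = [label for _, a in sample for label in a]
        value = metric(manual, auto)
        if value == value:  # drop undefined (NaN) resamples
            stats.append(value)
    if not stats:
        raise ValueError("metric was undefined in every resample")
    stats.sort()
    last = len(stats) - 1
    lower = stats[round((alpha / 2) * last)]
    upper = stats[round((1 - alpha / 2) * last)]
    return lower, upper

# Toy example: three short "recordings" (illustrative labels only).
recordings = [
    (["Wake", "N2", "N2", "REM"], ["Wake", "N2", "N2", "REM"]),
    (["Wake", "N2", "N3", "REM"], ["Wake", "N2", "N2", "N2"]),
    (["N1", "N2", "REM", "Wake"], ["N2", "N2", "REM", "Wake"]),
]
ci_low, ci_high = file_level_bootstrap_ci(recordings, n_resamples=200, seed=1)
```

Resampling at the recording level keeps the epochs of one night together, so the interval reflects between-participant variability instead of treating correlated epochs as independent observations.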