Abstract

Rationale
With the increasing demand for pediatric sleep testing, home Level III sleep studies are being used more frequently; however, their accuracy and role in children remain uncertain. These devices use built-in algorithms to score respiratory events automatically, which may streamline workflow, but the algorithms must be validated against manual scoring by trained technologists. This study evaluated the agreement between auto-scored and manually scored indices in obese adolescents undergoing home sleep testing.

Methods
This secondary analysis is part of The Diagnostic Accuracy of an Ambulatory Level III Sleep Study for the Screening of Sleep-Disordered Breathing in Obese Adolescents, a protocol comparing home Level III testing with the gold-standard in-laboratory polysomnography (PSG). Participants aged 10-18 years with obesity, defined as body mass index (BMI) above the 95th percentile for age and sex, were included. Exclusion criteria were the presence of genetic syndromes or the need for oxygen or positive airway pressure therapy. Home studies were performed using the Nox T3 portable monitor. Each recording was first auto-scored by Noxturnal software, then independently rescored manually by a registered polysomnographic technologist and a sleep physician following American Academy of Sleep Medicine (AASM) criteria. The obstructive apnea-hypopnea index (OAHI) and central apnea-hypopnea index (CAHI) obtained from automated and manual scoring were compared using the Wilcoxon signed-rank test for medians, Spearman's coefficient for correlation, the intraclass correlation coefficient (ICC) for reliability, and Bland-Altman analysis for agreement.

Results
Thirty-one obese adolescents (median age 12 years, IQR 10-16; 52% male) underwent home Level III testing. Median (IQR) values for manual versus automated scoring were: OAHI 2.8 (0.35-6.7) vs 1.1 (0.45-1.9), p = 0.004; and CAHI 0.3 (0.00-0.9) vs 0.0 (0.00-0.25), p = 0.001. Bland-Altman plots showed systematic underestimation by the automated algorithm, with an OAHI bias of −1.9 events/h (limits of agreement −12.0 to +8.2) and a CAHI bias of −1.1 events/h (limits of agreement −9.5 to +7.3). The ICC for total AHI was 0.98, indicating excellent reliability.

Conclusions
Automated scoring of home Level III sleep studies in obese adolescents showed strong correlation with manual scoring but systematically underestimated both obstructive and central events. Although both methods followed similar trends, the observed bias and variability indicate that the automated algorithm cannot be considered interchangeable with manual scoring. Manual review remains essential for accurate diagnostic interpretation, particularly in cases with higher severity or borderline findings.

This abstract is funded by: None
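The Bland-Altman bias and limits of agreement reported in the Results can be illustrated with a minimal sketch. The abstract does not state how the limits were computed; the code below assumes the conventional definition (mean difference ± 1.96 × sample standard deviation of the differences) and takes differences as automated minus manual, which matches a negative bias meaning underestimation. The reported figures are consistent with symmetric limits of this kind (for OAHI, −1.9 ± 10.1 gives −12.0 to +8.2). The function name `bland_altman` and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(auto, manual):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    Differences are taken as auto - manual, so a negative bias means the
    automated scoring reads lower than manual scoring on average.
    Assumes the conventional limits: bias +/- 1.96 * sample SD.
    """
    if len(auto) != len(manual):
        raise ValueError("auto and manual must be paired (same length)")
    if len(auto) < 2:
        raise ValueError("need at least two pairs")
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    half_width = 1.96 * sd
    return bias, bias - half_width, bias + half_width


# Hypothetical OAHI values in events/h (illustrative only, NOT study data).
manual_oahi = [0.4, 2.8, 6.7, 12.0, 1.5]
auto_oahi = [0.5, 1.1, 1.9, 7.5, 1.0]

bias, lower, upper = bland_altman(auto_oahi, manual_oahi)
print(f"bias = {bias:.2f} events/h, limits of agreement = {lower:.2f} to {upper:.2f}")
```

Wide limits relative to the clinical thresholds of interest, as in the reported results, are what argue against treating the two scoring methods as interchangeable even when correlation or ICC is high.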
Escobar et al. (Fri,) studied this question.