What question did this study set out to answer?

To evaluate the performance of pediatric-trained machine learning sleep-wake classifiers and compare them with adult-trained classifiers.

May 10, 2026

0340 The Sleep-Wake Classification Performance of Pediatric-Trained Machine Learning Algorithms for Actigraphy Data

Key Points

The study evaluated pediatric-trained machine-learning sleep-wake classifiers and compared them with an adult-trained classifier and the algorithms in GGIR.
Sixty children underwent overnight polysomnography and wore a GENEActiv device for actigraphy.
Acceleration data were analyzed using six machine-learning models with leave-one-subject-out cross-validation.
Performance metrics included sensitivity, specificity, and balanced accuracy, along with discrepancy analyses of sleep duration.
Top classifiers included random forest (BA=0.78, Sensitivity=0.87, Specificity=0.69) and neural network (BA=0.77, Sensitivity=0.83, Specificity=0.73).
Pediatric classifiers outperformed an adult-trained neural net (BA=0.71, Sensitivity=0.93, Specificity=0.49).
GGIR-CK and GGIR-vH showed comparable epoch-by-epoch performance, but both underestimated sleep duration by more (33 minutes on average) than the random forest classifier (15 minutes on average).
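
The metrics named in the key points can be illustrated with a minimal sketch. It assumes sleep is the positive class (so sensitivity measures sleep detection and specificity measures wake detection), that labels are coded 1 = sleep and 0 = wake, and that epochs are 30 seconds long. The function names are hypothetical and are not taken from the study, and the source does not say whether its metrics were pooled across all epochs or averaged per participant.

```python
def epoch_metrics(reference, predicted):
    """Return (sensitivity, specificity, balanced accuracy) for epoch labels.

    Both inputs are equal-length sequences with 1 = sleep and 0 = wake.
    Sleep is treated as the positive class (an assumption of this sketch).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # sleep scored as sleep
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # sleep scored as wake
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # wake scored as wake
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # wake scored as sleep

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both sleep and wake epochs")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


def sleep_duration_discrepancy_minutes(reference, predicted, epoch_seconds=30):
    """Predicted minus reference total sleep time, in minutes.

    A negative value means the classifier underestimated sleep duration.
    """
    return (sum(predicted) - sum(reference)) * epoch_seconds / 60


# Toy night of ten epochs (made-up labels, not study data).
reference = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
print(epoch_metrics(reference, predicted))
print(sleep_duration_discrepancy_minutes(reference, predicted))
```

Balanced accuracy weights sleep and wake detection equally, which matters here because roughly three quarters of the epochs were sleep and plain accuracy would reward a classifier that scores nearly everything as sleep.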

Abstract

Introduction: Increasingly, actigraphy methods seek to leverage raw acceleration data and machine-learning classification for scoring. However, much of the progress has been made in adults. We therefore trained machine-learning sleep-wake classifiers using pediatric data, assessed their sleep-wake scoring ability, and benchmarked them against an adult-trained classifier and the algorithms in GGIR.

Methods: Sixty children (26 female, ages 5.3–17.7 years) completed in-lab overnight polysomnography at the Children’s Hospital of Philadelphia and wore a GENEActiv device (3-axis accelerometer, 50 Hz) on their non-dominant wrist. The acceleration data were converted into 30-second epochs and aligned with physician-scored sleep-wake data from electroencephalography. Six machine-learning models were trained using leave-one-subject-out cross-validation. Epoch-by-epoch analyses generated the performance metrics sensitivity and specificity, and balanced accuracy (BA) was used to rank the models. Discrepancy analyses compared each method's estimated overall sleep duration with the polysomnography reference.

Results: Overall, 560.1 hours of data were collected; 74.4% of epochs were scored as sleep, with an average sleep duration of 7.1 hours (SD = 1.9). Of the six pediatric-trained machine-learning models, the top two were random forest (BA = 0.78; Sensitivity = 0.87; Specificity = 0.69) and neural network (BA = 0.77; Sensitivity = 0.83; Specificity = 0.73). These performance metrics exceeded those of an adult-trained neural network classifier applied to our data (BA = 0.71; Sensitivity = 0.93; Specificity = 0.49), but were comparable to those of the GGIR Cole-Kripke (GGIR-CK: BA = 0.79; Sensitivity = 0.75; Specificity = 0.85) and GGIR van Hees (GGIR-vH: BA = 0.78; Sensitivity = 0.84; Specificity = 0.73) algorithms. Overall, sleep duration was underestimated by an average of 15 minutes using the random forest classifier and by an average of 37 minutes using the neural network classifier. For comparison, both GGIR-CK and GGIR-vH underestimated sleep duration by an average of 33 minutes.

Conclusion: We trained pediatric sleep-wake classifiers that had a strong ability to detect sleep and a moderate-to-strong ability to detect wake. Based on epoch-by-epoch and discrepancy analyses, the random forest classifier performed best, outperforming GGIR-CK, GGIR-vH, and an adult-trained neural network classifier. With larger samples used for training and validation, we may reduce variability and further improve pediatric sleep-wake classification using actigraphy.
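
The data preparation and validation scheme described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the 50 Hz sampling rate and 30-second epoch length come from the abstract, but the per-epoch feature (mean vector magnitude) and both function names are hypothetical, since the abstract does not say which features the models used.

```python
import math

SAMPLE_RATE_HZ = 50   # sampling rate stated in the abstract
EPOCH_SECONDS = 30    # epoch length stated in the abstract
SAMPLES_PER_EPOCH = SAMPLE_RATE_HZ * EPOCH_SECONDS  # 1500 samples per epoch


def epoch_features(x, y, z):
    """Collapse raw tri-axial samples into one value per 30-second epoch.

    The feature here (mean vector magnitude) is only an illustration; the
    abstract does not describe the study's features. Samples in a trailing
    partial epoch are dropped.
    """
    if not (len(x) == len(y) == len(z)):
        raise ValueError("x, y and z must have the same length")

    n_epochs = len(x) // SAMPLES_PER_EPOCH
    features = []
    for e in range(n_epochs):
        start = e * SAMPLES_PER_EPOCH
        stop = start + SAMPLES_PER_EPOCH
        magnitudes = [
            math.sqrt(x[i] ** 2 + y[i] ** 2 + z[i] ** 2)
            for i in range(start, stop)
        ]
        features.append(sum(magnitudes) / SAMPLES_PER_EPOCH)
    return features


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    subject_ids holds one entry per epoch. Each subject is held out exactly
    once, so no child's epochs appear in both training and test data.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Toy usage with made-up subject labels (not study data).
for subject, train_idx, test_idx in leave_one_subject_out(["a", "a", "b", "c", "c"]):
    print(subject, train_idx, test_idx)
```

Splitting by subject rather than by epoch keeps all of one child's epochs on the same side of each split, so each fold estimates performance on a child the model has not seen.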

Cite This Study

Chen et al. studied this question.

synapsesocial.com/papers/6a002222c8f74e3340f9d146 https://doi.org/10.1093/sleep/zsag091.0340
