Abstract

Introduction
Actigraphy methods increasingly leverage raw acceleration data and machine-learning classification for sleep-wake scoring. However, much of this progress has been made in adults. We therefore trained machine-learning sleep-wake classifiers on pediatric data, assessed their sleep-wake scoring ability, and benchmarked them against an adult-trained classifier and the algorithms in GGIR.

Methods
Sixty children (26 female, ages 5.3–17.7 years) completed in-lab overnight polysomnography at the Children’s Hospital of Philadelphia and wore a GENEActiv device (3-axis accelerometer, 50 Hz) on their non-dominant wrist. The acceleration data were converted into 30-second epochs and aligned with physician-scored sleep-wake data from electroencephalography. Six machine-learning models were trained using leave-one-subject-out cross-validation. Epoch-by-epoch analyses yielded sensitivity and specificity, and balanced accuracy (BA) was used to rank the models. Discrepancy analyses compared the estimated overall sleep duration against polysomnography.

Results
Overall, 560.1 hours of data were collected; 74.4% of epochs were scored as sleep, with an average sleep duration of 7.1 hours (SD = 1.9). Of the six pediatric-trained machine-learning models, the top two were random forest (BA = 0.78; Sensitivity = 0.87; Specificity = 0.69) and neural network (BA = 0.77; Sensitivity = 0.83; Specificity = 0.73). Both achieved a higher balanced accuracy than an adult-trained neural network classifier applied to our data (BA = 0.71; Sensitivity = 0.93; Specificity = 0.49) and performed comparably to the GGIR Cole-Kripke (GGIR-CK: BA = 0.79; Sensitivity = 0.75; Specificity = 0.85) and GGIR van Hees algorithms (GGIR-vH: BA = 0.78; Sensitivity = 0.84; Specificity = 0.73). Overall, sleep duration was underestimated by an average of 15 minutes using the random forest classifier and by an average of 37 minutes using the neural network classifier. For comparison, both GGIR-CK and GGIR-vH underestimated sleep duration by an average of 33 minutes.

Conclusion
We trained pediatric sleep-wake classifiers that had a strong ability to detect sleep and a moderate-to-strong ability to detect wake. Based on epoch-by-epoch and discrepancy analyses, the random forest classifier performed best, outperforming GGIR-CK, GGIR-vH, and an adult-trained neural network classifier. Training and validating on larger samples may reduce variability and further improve pediatric sleep-wake classification using actigraphy.
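
To make the evaluation terms above concrete, the following is a minimal sketch of how epoch-by-epoch metrics, the sleep-duration discrepancy, and leave-one-subject-out folds could be computed. It is not the authors' code. It assumes sleep is coded as 1 (the positive class) and wake as 0, and all function and variable names are hypothetical.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

EPOCH_SECONDS = 30  # epoch length stated in the Methods


def epoch_metrics(reference: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Epoch-by-epoch agreement. Assumption: 1 = sleep (positive class), 0 = wake."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # sleep scored as sleep
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # wake scored as wake
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # wake scored as sleep
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # sleep scored as wake
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one sleep epoch and one wake epoch")
    sensitivity = tp / (tp + fn)  # ability to detect sleep
    specificity = tn / (tn + fp)  # ability to detect wake
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def sleep_minutes(labels: Sequence[int]) -> float:
    """Total sleep duration implied by a sequence of 30-second epoch labels."""
    return sum(labels) * EPOCH_SECONDS / 60


def duration_discrepancy_minutes(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """Predicted minus reference sleep duration; negative means underestimation."""
    return sleep_minutes(predicted) - sleep_minutes(reference)


def leave_one_subject_out(subject_ids: Sequence[str]) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held-out subject, training indices, test indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

In this sketch, each fold holds out every epoch from one child, so a model is never tested on a child it was trained on. Balanced accuracy is the mean of sensitivity and specificity, which keeps the ranking from being dominated by the majority class when most epochs are sleep.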
Chen et al. (Fri,) studied this question.