This study aimed to identify dental pain in Brazilian adolescents using machine learning (ML) algorithms for public health screening purposes. Data came from two cross-sectional waves (2015 and 2019) of the Brazilian National Survey of School Health (PeNSE), covering schoolchildren aged 11 to 18. The outcome was dental pain in the last 6 months. Fifty-three covariates were included, covering demographic, socioeconomic, and behavioral characteristics. The 2015 dataset was split 80:20 into training and test sets, while the 2019 dataset served as a temporal external validation set. Nine ML models were evaluated. A total of 259,833 adolescents (97.0% of the sample) were included. Dental pain prevalence was 19.5% (95% CI, 19.2-19.8). Extra Trees (ET) had the best metrics in both the test and external validation sets: AUC = 0.64 (95% CI, 0.63-0.65) and recall = 0.57 in the test set, and AUC = 0.62 (95% CI, 0.62-0.63) and recall = 0.57 in the external validation set. These values indicate a modest ability to discriminate adolescents with dental pain and to identify approximately 57 out of 100 affected individuals. Fairness estimates showed lower accuracy but higher recall for males, and higher accuracy but lower recall for white adolescents. Shapley values showed that sex, alcohol consumption, and family violence were the most important variables in the algorithm's identification process. This study shows the potential of ML to identify dental pain in adolescents, but the modest predictive performance and fairness limitations highlight the need for improvements before widespread adoption.
Chisini et al. studied this question.
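The metrics reported in the abstract (AUC, recall, and accuracy broken down by subgroup) can be illustrated with a minimal sketch. It does not reproduce the study's Extra Trees model or the PeNSE data. The labels, scores, group names `A` and `B`, and the 0.5 decision threshold are all hypothetical, chosen only to show how one group can have lower accuracy yet higher recall than another.

```python
from collections import defaultdict


def recall(y_true, y_pred):
    """Share of truly affected individuals that are flagged: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def accuracy(y_true, y_pred):
    """Share of all individuals classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive is scored above a
    random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metrics_by_group(y_true, y_pred, groups):
    """Accuracy and recall computed separately for each subgroup."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {
        g: {"accuracy": accuracy(t, p), "recall": recall(t, p)}
        for g, (t, p) in buckets.items()
    }


# Hypothetical toy data (NOT from the study): 8 individuals per group.
y_true = [1, 1, 0, 0, 0, 0, 0, 0,   1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.7, 0.6, 0.2, 0.1, 0.3, 0.4,
          0.8, 0.3, 0.2, 0.1, 0.4, 0.3, 0.2, 0.1]
groups = ["A"] * 8 + ["B"] * 8

THRESHOLD = 0.5  # assumed cut-off; the abstract does not state one
y_pred = [1 if s >= THRESHOLD else 0 for s in scores]

overall_auc = auc(y_true, scores)
overall_recall = recall(y_true, y_pred)
report = metrics_by_group(y_true, y_pred, groups)

if __name__ == "__main__":
    print(f"AUC={overall_auc:.3f} recall={overall_recall:.2f}")
    for name, m in sorted(report.items()):
        print(name, m)
```

In this toy data, group `A` is flagged more often, so it gains recall at the cost of extra false positives and therefore lower accuracy. That is the same kind of trade-off the abstract describes for males. A recall of 0.57 reads the same way at scale: of 100 adolescents with dental pain, about 57 would be flagged.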