OBJECTIVES: To develop and evaluate artificial intelligence (AI) models for detecting and classifying voice disorders from acoustic recordings, with the aim of facilitating earlier diagnosis and optimizing clinical resource allocation.
METHODS: This multicenter predictive modeling study analyzed data from 1948 patients with voice disorders and 665 controls collected at two Belgian hospitals between 2014 and 2025. Acoustic recordings of seven standardized speech tasks were analyzed. A fixed split assigned 85% of the data to training with 10-fold stratified cross-validation (CV), and the remaining 15% was reserved as an independent hold-out test set. Two modeling strategies were evaluated: (1) extraction of HuBERT features paired with various classifiers and (2) fine-tuning of a pretrained Audio Spectrogram Transformer (AST). Six binary diagnostic classifiers were trained: healthy vs pathological, and five one-vs-rest (OvR) classifiers within the pathological cohort (neurological, benign lesion, functional, inflammatory, and tumor). A hierarchical ensemble combined the model outputs, using the healthy vs pathological model as the primary binary gatekeeper and the secondary OvR models to classify specific voice disorders. Performance was assessed with AUROC and F1 score as primary metrics.
RESULTS: Among 2613 total participants (median age 51 years for patients; 36 years for controls), near-perfect detection of healthy vs pathological voices was achieved, with an AUROC of 0.993 (95% CI, 0.986-0.996) and an F1 score of 0.949. Performance for classifying specific disorder subtypes was lower; the distinction between non-neurological and neurological disorders achieved an AUROC of 0.744. Other binary models using HuBERT features demonstrated modest performance, with AUROCs ranging from 0.669 to 0.764 and F1 scores from 0.447 to 0.680.
CONCLUSIONS: While current AI models, particularly AST, demonstrate high diagnostic accuracy in distinguishing pathological from healthy voices, performance in classifying specific disorder subtypes requires further improvement. These findings suggest that AI-driven acoustic analysis has significant potential as a noninvasive screening tool supporting the earlier identification of voice disorders.
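The hierarchical ensemble described in METHODS can be illustrated with a minimal sketch. The abstract states only that the healthy vs pathological model acts as a gatekeeper and that the OvR models then classify the disorder; the 0.5 gate threshold, the highest-score selection rule, and the names `hierarchical_predict`, `SUBTYPES`, and `gate_threshold` are assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict

# Subtype labels taken from the abstract's five one-vs-rest classifiers.
SUBTYPES = ("neurological", "benign lesion", "functional", "inflammatory", "tumor")


def hierarchical_predict(
    p_pathological: float,
    ovr_scores: Dict[str, float],
    gate_threshold: float = 0.5,  # assumed cut-off, not reported in the abstract
) -> str:
    """Return "healthy" or one disorder subtype for a single recording."""
    missing = [s for s in SUBTYPES if s not in ovr_scores]
    if missing:
        raise ValueError(f"missing OvR scores for: {missing}")

    # Stage 1: the healthy-vs-pathological model acts as the gatekeeper.
    if p_pathological < gate_threshold:
        return "healthy"

    # Stage 2 (assumed rule): pick the subtype whose OvR model scores highest.
    return max(SUBTYPES, key=lambda s: ovr_scores[s])


# Hypothetical scores, invented purely to exercise the function.
example_scores = {
    "neurological": 0.12,
    "benign lesion": 0.61,
    "functional": 0.33,
    "inflammatory": 0.08,
    "tumor": 0.27,
}
print(hierarchical_predict(0.97, example_scores))  # benign lesion
print(hierarchical_predict(0.10, example_scores))  # healthy
```

Under this design, an error made by the gatekeeper cannot be corrected by the second stage, which is one reason the strong healthy vs pathological result matters for the ensemble as a whole.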
L. Berteloot
AZ Delta
Fergio Sismono
University of Antwerp
Léonore Maertens
AZ Delta
Journal of Voice
University of Antwerp
AZ Delta
Thomas More University
Berteloot et al. studied this question.
synapsesocial.com/papers/6a13e6b30e02ee3982d319a1 — DOI: https://doi.org/10.1016/j.jvoice.2026.04.024