Abstract

Rationale: Drug-induced sleep endoscopy (DISE) provides direct visualization of upper airway collapse in obstructive sleep apnea (OSA), but it is invasive, costly, and not widely available. Snoring sounds carry physiologic information related to the site of collapse, yet current approaches rarely leverage this potential. We hypothesize that machine learning can automatically learn acoustic signatures of site-specific airway collapse from snoring recorded during DISE, enabling a non-invasive and objective assessment of obstruction sites.

Methods: We conducted a prospective analysis using a dataset of snoring recordings from 485 patients who underwent DISE as part of clinical evaluation for OSA. Audio was recorded using a calibrated omnidirectional microphone synchronized with endoscopic video and physiologic signals. We manually annotated the start and stop times of over 17,000 individual snores through auditory and visual inspection in Praat. From each segment, we extracted 96 acoustic features encompassing temporal (RMS energy, zero-crossing rate, skewness), spectral (centroid, bandwidth, entropy), formant and pitch-based features, wavelet energy coefficients, and Mel-frequency cepstral coefficients (MFCCs). To capture site-specific collapse patterns, we developed independent binary classifiers for four DISE-verified sites: concentric palatal collapse (CCC), lateral wall (LW), tongue base (TB), and epiglottic (E). Each site-specific classifier contrasted positive subjects against a balanced, randomly selected control group that excluded overlapping collapse sites. Strict subject-level separation between training and testing ensured no data leakage. Nine algorithms were evaluated: logistic regression (LR), random forest (RF), decision tree (DT), naïve Bayes (NB), k-nearest neighbors (KNN), support vector machine (SVM), LightGBM, XGBoost, and CatBoost. Each experiment was repeated 10 times using randomized sampling to test robustness and reproducibility.
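The evaluation protocol in the Methods rests on two ideas that are easy to get wrong in practice: splitting by subject rather than by snore, and scoring with unweighted average recall (UAR). The following is a minimal sketch of both, not the authors' pipeline. The function names, the toy subject IDs, and the 30% test fraction are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def subject_level_split(subject_ids, test_fraction=0.3, seed=0):
    """Split snore indices so that no subject appears in both train and test.

    subject_ids[i] is the (hypothetical) subject identifier of snore i.
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls: every class counts equally, whatever its size."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy data (assumed): 6 subjects with 4 snores each.
subject_ids = [s for s in range(6) for _ in range(4)]
train_idx, test_idx = subject_level_split(subject_ids, test_fraction=0.33, seed=42)

# No subject contributes snores to both partitions.
train_subjects = {subject_ids[i] for i in train_idx}
test_subjects = {subject_ids[i] for i in test_idx}
assert train_subjects.isdisjoint(test_subjects)

# recall(class 1) = 3/4, recall(class 0) = 1/2, so UAR = 0.625.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(unweighted_average_recall(y_true, y_pred))
```

For a binary classifier, a UAR of 0.5 corresponds to chance-level performance, which is a useful reference point when reading the reported values.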
Results: Across all sites, the mean unweighted average recall (UAR) of the best-performing classifier ranged from 0.55 ± 0.04 to 0.66 ± 0.09 (Figure 1). The highest UAR was achieved by NB for CCC (0.657 ± 0.089), followed by LR for TB (0.609 ± 0.027), KNN for epiglottic collapse (0.578 ± 0.075), and LR for LW (0.556 ± 0.036).

Conclusions: These findings demonstrate the feasibility of identifying collapse sites directly from snoring acoustics.

This abstract is funded by: NIH, Grant No. 5R01HL128658
S Saha
Meharry Medical College
K White
Meharry Medical College
H Kotun
Meharry Medical College
American Journal of Respiratory and Critical Care Medicine
Brigham and Women's Hospital
Meharry Medical College
Saha et al. (Fri,) studied this question.
synapsesocial.com/papers/6a0d4f34f03e14405aa9a775 — DOI: https://doi.org/10.1093/ajrccm/aamag162.6337