Accurate localization of obstructive sites in the upper airway is a critical factor in the diagnostic work-up and treatment decisions for sleep-related breathing disorders, and is urgently needed. Snoring, as a dynamic acoustic signal, carries information about the sites and degree of obstruction in the upper airway, offering a non-invasive, cost-effective route to obstructive-site recognition. However, most existing snoring-based methods for recognizing obstructive sites draw on limited information (mainly either traditional acoustic characteristics or spectrogram features) and may therefore miss dynamic pathological information. Moreover, existing methods work from either a one-dimensional (1D) signal or a two-dimensional (2D) image perspective, so complementary information from the other modality may be overlooked. This paper proposes a multi-modal framework that combines the 1D snoring waveform with a 2D Composite Acoustic Feature Graph (CAF-Graph). The 1D snoring waveform captures fine temporal structure and local patterns, from which neural networks learn high-level discriminative representations. The 2D CAF-Graph emphasizes the dynamic spatio-temporal and physiological-acoustic characteristics of snoring by concatenating acoustic features related to Prosodic, Formant, Spectral, and Cepstral characteristics. A multi-modal fusion network (BMFNet) then integrates the independent and interactive information of the single-modal features, offering a more comprehensive perspective. The recognition task is formulated as a three-class classification problem: upper (snoring caused by upper-level obstruction), lower (snoring caused by lower-level obstruction), and silence (obstruction without snoring).
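As a rough illustration of what concatenating feature groups into a 2D representation can look like, the sketch below stacks per-frame Prosodic, Formant, Spectral, and Cepstral rows into one array. It is not the authors' CAF-Graph construction: the group order, the row-wise min-max scaling, the function names, and the toy values are all assumptions made for this example, since the abstract does not specify them.

```python
from typing import Dict, List

# rows = feature dimensions, columns = time frames
Matrix = List[List[float]]

# Assumed group order; the abstract names the four groups but not their layout.
GROUP_ORDER = ["prosodic", "formant", "spectral", "cepstral"]


def min_max_rows(block: Matrix) -> Matrix:
    """Scale each feature row to [0, 1] (an assumed normalization step)."""
    scaled = []
    for row in block:
        lo, hi = min(row), max(row)
        span = hi - lo
        # A constant row carries no variation, so map it to zeros.
        scaled.append([(v - lo) / span if span > 0 else 0.0 for v in row])
    return scaled


def build_feature_graph(groups: Dict[str, Matrix]) -> Matrix:
    """Concatenate the feature groups along the feature axis."""
    frame_counts = {len(row) for name in GROUP_ORDER for row in groups[name]}
    if len(frame_counts) != 1:
        raise ValueError("all feature rows must cover the same number of frames")
    graph: Matrix = []
    for name in GROUP_ORDER:
        graph.extend(min_max_rows(groups[name]))
    return graph


# Toy values for three frames; purely illustrative, not real snoring features.
toy = {
    "prosodic": [[60.0, 62.0, 65.0]],
    "formant": [[500.0, 520.0, 510.0], [1500.0, 1480.0, 1490.0]],
    "spectral": [[0.2, 0.4, 0.3]],
    "cepstral": [[-1.0, 0.0, 1.0], [2.0, 2.0, 2.0]],
}
graph = build_feature_graph(toy)
print(len(graph), "feature rows x", len(graph[0]), "frames")
```

The resulting array has one row per feature dimension and one column per frame, so it can be treated as an image-like input alongside the raw 1D waveform.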
The proposed method was validated on a clinical dataset collected at the ENT Institute and Department of Otorhinolaryngology, Eye & ENT Hospital, Fudan University, where it reached 81.2% Accuracy, 86.8% Weighted Average Precision, 81.2% Weighted Average Recall, and 82.3% Weighted Average F1-Score. The results demonstrate the effectiveness of multi-modal feature representations for snoring and offer new insight into obstructive-site recognition.
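The weighted average metrics reported above weight each class score by that class's share of the true labels. The following minimal sketch computes them for the three classes; the function name and the toy labels are hypothetical and are not the study's data. Under this definition, weighted average recall always equals accuracy, which is consistent with both being reported as 81.2%.

```python
from collections import Counter
from typing import Dict, Sequence


def weighted_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Support-weighted precision, recall, and F1, plus plain accuracy."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = len(y_true)
    support = Counter(y_true)    # true samples per class
    predicted = Counter(y_pred)  # predicted samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    out = {
        "accuracy": sum(correct.values()) / total,
        "precision": 0.0,
        "recall": 0.0,
        "f1": 0.0,
    }
    for label, n in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = n / total  # class share of the true labels
        out["precision"] += weight * precision
        out["recall"] += weight * recall
        out["f1"] += weight * f1
    return out


# Hypothetical labels for ten segments, for illustration only.
y_true = ["upper"] * 4 + ["lower"] * 3 + ["silence"] * 3
y_pred = ["upper", "upper", "upper", "lower",
          "lower", "lower", "silence",
          "silence", "silence", "upper"]
scores = weighted_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
```

Because the per-class recall is the number of correct predictions divided by the class support, multiplying by the support share and summing leaves the total number of correct predictions divided by the number of samples, which is the accuracy.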
Hu et al. studied this question.
synapsesocial.com/papers/69ada8c2bc08abd80d5bc02e — DOI: https://doi.org/10.1109/jbhi.2026.3670208
Xia Hu
Fudan University
Rui Fang
Huazhong Agricultural University
HUIPING LUO
IEEE Journal of Biomedical and Health Informatics
UNSW Sydney
Fudan University
Eye & ENT Hospital of Fudan University