Chronic obstructive pulmonary disease (COPD) remains a prevalent and debilitating respiratory condition, and early, accurate diagnosis is essential for effective clinical intervention. In this study, we propose a deep learning-based diagnostic framework that uses the ECAPA-TDNN (Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network) architecture to classify respiratory sound signals from the ICBHI dataset. Originally designed for speaker verification, ECAPA-TDNN combines channel attention with multi-scale feature aggregation, which we adapt to medical acoustic analysis. This architecture allows the model to capture subtle, discriminative patterns in pathological breathing sounds, overcoming limitations of conventional CNN-based methods. Our methodology integrates signal preprocessing, log-Mel spectrogram extraction, and data augmentation to improve robustness and generalization. An Attentive Statistics Pooling mechanism summarizes features over time, and Grad-CAM-based explainability is incorporated to make the diagnostic predictions easier to interpret. The model is validated with five-fold cross-validation and achieves a mean validation accuracy of 96.8%, with consistently high F1-scores and recall across all folds. Comparative analysis with prior methods indicates that the ECAPA-TDNN-based approach is superior in diagnostic precision, robustness, and potential clinical applicability. To the best of our knowledge, this is the first work to adapt ECAPA-TDNN for COPD detection from respiratory sounds, establishing a new benchmark for interpretable, high-performance acoustic screening of respiratory disease.
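To make the temporal summarization step concrete, the following is a minimal sketch of attentive statistics pooling in its basic form: frame-level features are reduced to an attention-weighted mean and standard deviation. It is an illustration under stated assumptions, not the authors' implementation. The function name `attentive_stats_pooling` is hypothetical, and the per-frame attention scores are passed in directly, whereas in a trained model they would come from a small learned network (and, in ECAPA-TDNN, would typically be computed per channel).

```python
import math


def attentive_stats_pooling(frames, scores):
    """Pool T frame vectors into one [weighted mean, weighted std] vector.

    frames: list of T feature vectors, each of length C.
    scores: list of T unnormalized attention scores (assumed given here;
            a real model would learn to produce them).
    """
    # Softmax over time, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(frames[0])
    # Attention-weighted first and second moments per feature dimension.
    mean = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    second = [sum(w * f[d] ** 2 for w, f in zip(weights, frames)) for d in range(dim)]
    # Clamp the variance so rounding error cannot produce a negative value.
    std = [math.sqrt(max(s - m * m, 1e-12)) for s, m in zip(second, mean)]
    return mean + std


# Two frames with equal scores: the result is the plain mean and std.
pooled = attentive_stats_pooling([[1.0, 2.0], [3.0, 6.0]], [0.0, 0.0])
```

With equal scores the pooling reduces to ordinary mean and standard deviation; raising one frame's score shifts both statistics toward that frame, which is how the mechanism can emphasize the most informative parts of a recording.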
Xu et al. studied this question.