What type of study is this?

This is a quantitative study.

September 28, 2025 | Open Access

Parallel Time-Frequency Multi-Scale Attention with Dynamic Convolution for Environmental Sound Classification

Key Points

PTFMSAN achieves a classification accuracy of 90% on the ESC-50 dataset, outperforming the baseline model for environmental sound classification.
The PTFMSA module integrates local and global attention across different scales to enhance feature representation.
A parallel branch structure helps to extract time and frequency domain features without mutual interference.
Ablation studies validated the contribution of each component.
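
The parallel branch idea in the key points can be illustrated with a minimal sketch. It is not the authors' PTFMSA module: the functions `time_branch`, `frequency_branch`, and `fuse`, the mean-pooling attention, and the `branch_logits` parameter are all assumptions made for illustration. Each branch computes attention along one axis of a time-frequency map from the same input, so neither branch sees the other's output, and the two results are combined with softmax-normalized branch weights (fixed numbers here, standing in for parameters that would be learned during training).

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def time_branch(spec):
    """Reweight each time frame; spec is indexed as spec[freq][time]."""
    n_freq, n_time = len(spec), len(spec[0])
    # One descriptor per frame: the mean over all frequency bins.
    frame_means = [sum(spec[f][t] for f in range(n_freq)) / n_freq
                   for t in range(n_time)]
    attn = softmax(frame_means)
    return [[spec[f][t] * attn[t] for t in range(n_time)]
            for f in range(n_freq)]


def frequency_branch(spec):
    """Reweight each frequency bin, independently of the time branch."""
    n_time = len(spec[0])
    # One descriptor per bin: the mean over all time frames.
    bin_means = [sum(row) / n_time for row in spec]
    attn = softmax(bin_means)
    return [[x * attn[f] for x in row] for f, row in enumerate(spec)]


def fuse(spec, branch_logits=(0.0, 0.0)):
    """Combine both branches; branch_logits is a hypothetical stand-in
    for learnable branch weights."""
    w_time, w_freq = softmax(list(branch_logits))
    t_out = time_branch(spec)
    f_out = frequency_branch(spec)
    return [[w_time * a + w_freq * b for a, b in zip(t_row, f_row)]
            for t_row, f_row in zip(t_out, f_out)]
```

Calling `fuse(spec)` on a small nested list returns a map of the same shape. Raising the first entry of `branch_logits` shifts the output toward the time branch, which is the role a trained weight would play.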

Abstract

Convolutional neural network (CNN) models are widely used for environmental sound classification (ESC). However, 2-D convolutions assume translation invariance along both the time and frequency axes, whereas in practice the frequency dimension is not shift-invariant. In addition, single-scale convolutions limit the receptive field, leading to incomplete feature representation. To address these issues, we introduce a parallel time-frequency multi-scale attention (PTFMSA) module that integrates local and global attention across multiple scales to improve dynamic convolution. We also introduce a parallel branch structure that avoids mutual interference when extracting time-domain and frequency-domain features, and we use learnable parameters that dynamically adjust the weights of the different branches during network training. Building on this module, we develop PTFMSAN, a compact network that processes raw waveforms directly for ESC. To further strengthen learning, between-class (BC) training is applied. Experiments on the ESC-50 dataset show that PTFMSAN outperforms the baseline model, achieving a classification accuracy of 90%, which is competitive among CNN-based networks. Ablation experiments verify the effectiveness of each module.
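
Between-class (BC) training can be illustrated with a minimal sketch, assuming its simplest linear form: two waveforms from different classes are mixed with a ratio r, and the network is trained to predict that same ratio as a soft label. The function name `mix_between_class` and its parameters are hypothetical, and the abstract does not state which mixing formula PTFMSAN uses; gain-aware variants that compensate for differences in loudness between the two clips are omitted here.

```python
def mix_between_class(x1, x2, label1, label2, n_classes, r):
    """Mix two equal-length waveforms with ratio r (0 <= r <= 1).

    Returns the mixed waveform and a soft target with weight r on
    label1 and 1 - r on label2. Plain linear mixing is an assumption
    of this sketch, not a detail confirmed by the abstract.
    """
    if len(x1) != len(x2):
        raise ValueError("waveforms must have the same length")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    # Sample-wise convex combination of the two waveforms.
    mixed = [r * a + (1.0 - r) * b for a, b in zip(x1, x2)]
    # Soft label carrying the same mixing ratio.
    target = [0.0] * n_classes
    target[label1] += r
    target[label2] += 1.0 - r
    return mixed, target
```

In a training loop, r would typically be drawn at random for each pair, and the loss would compare the predicted class distribution with `target` rather than with a one-hot label.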


Cite This Study

Wan et al. studied this question.

synapsesocial.com/papers/68d9051441e1c178a14f4b25 https://doi.org/10.3390/e27101007
