Los puntos clave no están disponibles para este artículo en este momento.
This study addresses the critical challenge of Speech Enhancement (SE) in noisy environments, where the deployment of Deep Neural Networksolutions on microcontrollers is hindered by their extensive computational demands. Focusing on this gap, our research introduces a novel SE method optimized for MCUs, employing a 2-layer GRU model that capitalizes on perceptual speech properties and innovative training methodologies. By incorporating self-reference signals and a dual strategy of compression and recovery based on the Mel scale, we develop an efficient model tailored for low-latency applications. Our GRU-2L-128 model demonstrates a significant reduction in size and computational requirements, achieving a 14.2× decrease in model size and a 409.1× reduction in operations compared to traditional DNN methods like DCCRN, without compromising performance. This advancement offers a promising solution for enhancing speech intelligibility in resource-constrained devices, marking a pivotal step in SE research and application.
Dai et al. (Tue,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: