September 10, 2025 | Open Access

Acoustic Signal Enhancement Using Deep Neural Networks

Key Points

The Convolutional Encoder-Decoder (CED) network gave the best speech enhancement under low SNR (0 dB) among the methods compared, with far fewer model parameters.
The networks were trained on STFT magnitude features of noisy signal frames and evaluated with four objective quality measures: PESQ, STOI, SSNR, and improvement in SNR.
Two families of deep neural networks were applied, convolutional neural networks (UNet, CED) and deep generative networks (VAE, VQVAE), and were tested on multiple noise types in both seen and unseen noise conditions.
The results indicate that deep neural networks can effectively denoise speech signals in environments with high background noise.
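Two of the four measures named above, SSNR and improvement in SNR, are simple enough to sketch directly. The following is a minimal illustration, not the authors' evaluation code: the function names, the 256-sample frame length, and the clamping of per-frame values to [-10, 35] dB (a common convention for segmental SNR) are assumptions that the abstract does not specify.

```python
import math

EPS = 1e-12  # guards against log(0) and division by zero


def snr_db(clean, estimate):
    """Global SNR in dB, treating (clean - estimate) as residual noise."""
    signal_energy = sum(c * c for c in clean)
    noise_energy = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10((signal_energy + EPS) / (noise_energy + EPS))


def segmental_snr_db(clean, estimate, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clamped to [lo, hi] dB (assumed limits)."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        stop = start + frame_len
        frame_snr = snr_db(clean[start:stop], estimate[start:stop])
        values.append(min(max(frame_snr, lo), hi))
    if not values:
        raise ValueError("signal is shorter than one frame")
    return sum(values) / len(values)


def snr_improvement_db(clean, noisy, enhanced):
    """Output SNR minus input SNR, both measured against the clean signal."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)
```

With a clean reference, the noisy input, and the enhanced output as equal-length sample lists, `snr_improvement_db(clean, noisy, enhanced)` is positive when enhancement reduced the residual noise. PESQ and STOI are considerably more involved and are normally computed with dedicated implementations.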

Abstract

Background noise in acoustic signals, such as speech, audio, and sound signals, degrades listening quality and causes hearing fatigue for the listener. Standard enhancement methods work best under high SNR conditions. Deep neural networks employed in image processing and speech recognition have demonstrated significant performance improvements, which motivates their use for denoising speech signals corrupted with multiple noises under low SNR conditions (0 dB). This study applied two types of deep neural networks, convolutional neural networks and deep generative networks, to remove background noise from speech signals under low SNR conditions. The noise reduction networks were trained to estimate the noise signal present, which was then subtracted to obtain the denoised speech signal. Two convolutional neural network architectures, the UNet and the Convolutional Encoder-Decoder network (CED), and two deep generative networks, Vector Quantized Variational Autoencoders (VQVAE) and Variational Autoencoders (VAE), were trained on STFT magnitude features of noisy signal frames. Four objective quality measures were used to assess the quality of the enhanced speech: Perceptual Evaluation of Speech Quality (PESQ), Short Time Objective Intelligibility (STOI), Segmental Signal to Noise Ratio (SSNR), and improvement in SNR. Spectral subtraction and logMMSE methods were used as baselines to evaluate the performance of these networks on two datasets. The comparative analysis supports the superiority of CED for denoising and enhancing speech signals corrupted with multiple noises under low SNR conditions, in both seen and unseen noise conditions, with a much smaller number of model parameters than the other methods.
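The estimate-then-subtract step described in the abstract can be sketched for a single frame as follows. This is an illustration under stated assumptions, not the paper's pipeline: `estimate_noise_magnitude` is a hypothetical stand-in for a trained network, the subtraction is assumed to happen on spectral magnitudes with a floor at zero, and the noisy phase is assumed to be reused for reconstruction. A plain DFT from the standard library replaces a windowed, overlapping STFT to keep the example dependency-free.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (O(n^2), small n only)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def denoise_frame(noisy_frame, estimate_noise_magnitude):
    """Subtract an estimated noise magnitude from a frame's spectrum.

    estimate_noise_magnitude is a hypothetical callable standing in for a
    trained network: it maps the noisy magnitudes to noise magnitudes.
    """
    spectrum = dft(noisy_frame)
    magnitude = [abs(x) for x in spectrum]
    phase = [cmath.phase(x) for x in spectrum]  # noisy phase, reused (assumption)
    noise_mag = estimate_noise_magnitude(magnitude)
    # Floor at zero so the subtraction never yields a negative magnitude.
    clean_mag = [max(m - v, 0.0) for m, v in zip(magnitude, noise_mag)]
    clean_spectrum = [cmath.rect(m, p) for m, p in zip(clean_mag, phase)]
    return idft(clean_spectrum)
```

Passing an estimator that returns all zeros leaves the frame unchanged, which is a convenient sanity check. A practical system would apply a window to each frame, process overlapping frames, and recombine them by overlap-add.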

Authors

Shibani Kar

Sambalpur University

V. Mukherjee

Sambalpur University

Journals

Engineering Technology & Applied Science Research
