March 18, 2024 (Open Access)

Improving Acoustic Echo Cancellation for Voice Assistants Using Neural Echo Suppression and Multi-Microphone Noise Reduction

Abstract

Keyword spotting (KS) and automatic speech recognition (ASR) on smart speakers in home environments remain challenging when loudspeaker signals interfere, despite improvements in acoustic echo cancellation (AEC) systems. In this work, we propose combining a single-microphone AEC system, consisting of an adaptive linear filter (linear AEC) and a neural echo suppressor (NES), with an adaptive filter developed for multi-microphone noise reduction, called Cleaner. This additional enhancement step lets the AEC system exploit spatial information to remove residual echo. The single-microphone NES model improves upon its waveform-domain counterpart proposed in [1] by using a frequency-domain representation that helps with generalization. Furthermore, we show that using multiple linear AEC configurations during model training provides large gains over a fixed configuration. On the hardest test condition considered, the proposed system outperforms the baseline model [1] for single-microphone input by 66% (relative) in KS false reject rate (FRR) and 52% (relative) in ASR word error rate (WER). In the multi-microphone setting, the FRR is reduced by an additional 52% and the WER by an additional 32%.
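The relative reductions quoted in the abstract can be chained to estimate the overall gain of the multi-microphone system over the baseline. The sketch below is only an illustration of that arithmetic. It assumes that each "additional" reduction is measured relative to the proposed single-microphone result, which the abstract does not state explicitly, and the function names are hypothetical rather than taken from the paper.

```python
def apply_relative_reduction(rate: float, reduction: float) -> float:
    """Return the error rate left after a relative reduction (0.66 means 66%)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return rate * (1.0 - reduction)


def combined_relative_reduction(*reductions: float) -> float:
    """Overall relative reduction when each step applies to the previous result."""
    remaining = 1.0  # baseline error rate, normalized to 1
    for reduction in reductions:
        remaining = apply_relative_reduction(remaining, reduction)
    return 1.0 - remaining


# Figures from the abstract; the chaining itself is an assumption (see above).
frr_total = combined_relative_reduction(0.66, 0.52)  # KS false reject rate
wer_total = combined_relative_reduction(0.52, 0.32)  # ASR word error rate

print(f"FRR: {frr_total:.1%} below baseline")
print(f"WER: {wer_total:.1%} below baseline")
```

Under that assumption, the multi-microphone system would sit roughly 84% below the baseline in FRR and 67% below it in WER. The absolute rates cannot be recovered from the abstract alone.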
