Acoustic echo cancellation (AEC) remains a critical challenge in full-duplex communication systems, where acoustic coupling between loudspeakers and microphones significantly degrades speech quality. Conventional adaptive filtering methods, such as the normalized least mean squares (NLMS) and recursive least squares (RLS) algorithms, are computationally efficient but suffer from slow convergence, sensitivity to nonlinear distortions, and poor adaptability under time-varying acoustic conditions. Deep neural network-based AEC approaches achieve better residual echo suppression, yet their high computational complexity limits real-time applicability. This paper proposes a fully convolutional recurrent network-based multiple sub-filter (FCRN-MSF) framework that combines the efficiency of adaptive filtering with the dynamic modelling capability of deep learning, yielding a 12-15 dB gain in echo return loss enhancement (ERLE) and a lower steady-state mean square error than state-of-the-art baselines. The hybrid architecture employs MSFs to capture multi-path echo characteristics across diverse delay distributions, while an FCRN-based step-size estimator adaptively tunes the learning rate using temporal-spatial correlations derived from bidirectional long short-term memory units and channel-wise attention mechanisms. Extensive evaluations on the ICASSP 2022 AEC challenge and DNS-CHiME3 datasets show that the proposed method achieves an ERLE of 38-40 dB at 40 dB SNR (a 12-15 dB improvement over NLMS baselines), a perceptual evaluation of speech quality (PESQ) score of 3.80 (a 0.10-0.15 point improvement), a signal-to-distortion ratio of 38.9 dB, and 30-40% faster convergence (1.10 s vs. 1.65 s) than traditional AEC algorithms. These results make it suitable for real-time deployment in resource-constrained full-duplex communication systems.
Tadi et al. (Sun,) studied this question.
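For readers unfamiliar with the baseline and the headline metric, the following is a minimal sketch of a single-filter NLMS echo canceller and an ERLE computation, where ERLE is taken as 10 log10 of the ratio of microphone power to residual power. It is not the proposed FCRN-MSF method: the step size here is a fixed constant rather than a network estimate, and the echo path, tap count, signal length, and function names (`nlms_echo_canceller`, `erle_db`) are illustrative assumptions, with white noise standing in for far-end speech and no near-end talker or background noise.

```python
import math
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step_size=0.5, eps=1e-8):
    """Run a basic NLMS adaptive filter and return the residual (error) signal.

    step_size is a fixed constant here (an assumption for illustration);
    the paper instead estimates it with a neural network.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    error = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        # Normalize the update by the input energy in the filter window.
        norm = sum(h * h for h in history) + eps
        weights = [w + step_size * e * h / norm
                   for w, h in zip(weights, history)]
        error.append(e)
    return error


def erle_db(mic, error, eps=1e-12):
    """ERLE in dB: 10 * log10(mean(mic^2) / mean(error^2))."""
    p_mic = sum(d * d for d in mic) / len(mic)
    p_err = sum(e * e for e in error) / len(error)
    return 10.0 * math.log10((p_mic + eps) / (p_err + eps))


# Synthetic test signal: white noise through a hypothetical short echo path.
random.seed(0)
echo_path = [0.6, -0.3, 0.15, 0.05]  # illustrative impulse response
far_end = [random.gauss(0.0, 1.0) for _ in range(4000)]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

error = nlms_echo_canceller(far_end, mic)

# Measure ERLE over the last 1000 samples, after the filter has adapted.
tail = 1000
steady_state_erle = erle_db(mic[-tail:], error[-tail:])
print(f"Steady-state ERLE: {steady_state_erle:.1f} dB")
```

Because this toy setup is noise-free and linear, the ERLE it reports is far higher than anything attainable on real recordings and should not be compared with the figures quoted above; the sketch only shows how the quantity is computed and where a learned step size would replace the constant `step_size`.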