Key points are not available for this paper at this time.
Spoofing detection for automatic speaker verification (ASV) aims to discriminate between genuine and spoofed speech. This topic has received increased attentions recently due to safety concerns with deploying an ASV system. While the performance of spoofing detection has improved significantly in clean condition in recent studies, the performance degrades dramatically in noisy conditions. To address this issue, in this paper, we propose to extract robust and discriminative deep features by using deep learning techniques for spoofing detection. In particular, we employ deep feedforward, recurrent, and convolutional neural networks to extract discriminative features. We also introduce multicondition training, noise-aware training, and annealed dropout training to make neural networks more robust against noise and to avoid overfitting to specific spoofing attacks and noise types. The proposed neural networks and training techniques are combined into a single framework for spoofing detection. Experimental evaluation is carried out on a noisy version of the standard ASVspoof 2015 corpus, including both additive noisy and reverberant scenarios. Experimental results confirm that the proposed system dramatically decreases averaged equal error rates from 19.1% and 22.6% to 3.2% and 5.1% for seen and unseen noisy conditions, respectively.
Qian et al. (Wed,) studied this question.