Recently, deep neural networks (DNNs) have been incorporated into i-vector-based speaker recognition systems, where they have significantly improved state-of-the-art performance. In these systems, a DNN is used to collect sufficient statistics for i-vector extraction. In this study, the DNN is a recently developed time-delay deep neural network (TDNN) that has achieved promising results in large-vocabulary continuous speech recognition (LVCSR) tasks. We believe that the TDNN-based system achieves the best reported results on SRE10, and it obtains a 50% relative improvement over our GMM baseline in terms of equal error rate (EER). For some applications, the computational cost of a DNN is high. Therefore, we also investigate a lightweight alternative in which a supervised GMM is derived from the TDNN posteriors. This method maintains the speed of the traditional unsupervised GMM but achieves a 20% relative improvement in EER.
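To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes frame-level class posteriors (for example, TDNN outputs) are already available as a matrix, and it shows (1) how zeroth- and first-order sufficient statistics could be accumulated from them and (2) how a "supervised" GMM could be estimated by weighting acoustic features with those same posteriors. The function names, the diagonal-covariance simplification, and the variance floor are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sufficient_stats(posteriors, features):
    """Accumulate zeroth- and first-order statistics.

    posteriors: (T, C) per-frame class posteriors (assumed given, e.g. from a DNN)
    features:   (T, D) per-frame acoustic features
    """
    n = posteriors.sum(axis=0)      # (C,)  soft frame counts per class
    f = posteriors.T @ features     # (C, D) posterior-weighted feature sums
    return n, f

def supervised_gmm(posteriors, features, var_floor=1e-6):
    """Estimate one Gaussian per class from posterior-weighted frames.

    Diagonal covariances are an assumption made here for brevity.
    """
    n, f = sufficient_stats(posteriors, features)
    s = posteriors.T @ (features ** 2)          # (C, D) weighted second moments
    n_safe = np.maximum(n, 1e-10)[:, None]      # guard against empty classes
    weights = n / n.sum()
    means = f / n_safe
    variances = np.maximum(s / n_safe - means ** 2, var_floor)
    return weights, means, variances

# Toy example: three frames, two classes, hard (one-hot) posteriors.
post = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
feats = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
weights, means, variances = supervised_gmm(post, feats)
```

Once such a GMM is estimated, its own posteriors can be computed at test time without running the network, which is the sense in which the lightweight alternative keeps the cost of a conventional GMM front end.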
David Snyder
ECRI Institute
Daniel Garcia-Romero
Johns Hopkins University
Daniel Povey
Xiaomi (China)
Johns Hopkins University
Snyder et al. studied this question.
synapsesocial.com/papers/6a1774701723722a886ea655 — DOI: https://doi.org/10.1109/asru.2015.7404779