April 1, 2009

Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling

Abstract

Acoustic models used in hidden Markov model/neural-network (HMM/NN) speech recognition systems are usually trained with a frame-based cross-entropy error criterion. In contrast, Gaussian mixture HMM systems are discriminatively trained using sequence-based criteria, such as minimum phone error or maximum mutual information, that are more directly related to speech recognition accuracy. This paper demonstrates that neural-network acoustic models can be trained with sequence classification criteria using exactly the same lattice-based methods that have been developed for Gaussian mixture HMMs, and that using a sequence classification criterion in training leads to considerably better performance. A neural network acoustic model with 153K weights trained on 50 hours of broadcast news has a word error rate of 34.0% on the rt04 English broadcast news test set. When this model is trained with the state-level minimum Bayes risk criterion, the rt04 word error rate is 27.7%.
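
To put the two word error rates quoted above in perspective, the following minimal sketch computes the absolute and relative reduction. The function name `wer_reduction` is illustrative and does not come from the paper; only the 34.0% and 27.7% figures are taken from the abstract.

```python
def wer_reduction(baseline_wer: float, new_wer: float) -> tuple[float, float]:
    """Return (absolute, relative) word-error-rate reduction.

    Both inputs are percentages. The relative reduction is returned
    as a fraction of the baseline WER.
    """
    absolute = baseline_wer - new_wer
    relative = absolute / baseline_wer
    return absolute, relative


# Figures from the abstract: cross-entropy baseline vs. state-level
# minimum Bayes risk training, both on the rt04 test set.
absolute, relative = wer_reduction(34.0, 27.7)
print(f"absolute reduction: {absolute:.1f} points")
print(f"relative reduction: {relative * 100:.1f}%")
```

Under these figures the improvement is about 6.3 points absolute, or roughly 18.5% relative to the cross-entropy baseline.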

Cite this study

Brian Kingsbury studied this question.

synapsesocial.com/papers/6a0cf02cd24d91c50ccc8d01 — DOI: https://doi.org/10.1109/icassp.2009.4960445

Authors

Brian Kingsbury

IBM (United States)

Institutions

IBM (United States)

IBM Research - Thomas J. Watson Research Center
