September 26, 2010

Cross-lingual and multi-stream posterior features for low resource LVCSR systems

Abstract

We investigate approaches for building large vocabulary continuous speech recognition (LVCSR) systems for new languages or new domains using limited amounts of transcribed training data. In these low resource conditions, the performance of conventional LVCSR systems degrades significantly. We propose to train a low resource LVCSR system with additional sources of information, such as annotated data from other languages (German and Spanish) and various acoustic feature streams (short-term and modulation features). We train multilayer perceptrons (MLPs) on these sources of information and use Tandem features derived from the MLPs for low resource LVCSR. In our experiments, the proposed system, trained using only one hour of English conversational telephone speech (CTS), provides a relative improvement of 11% over the baseline system.

Index Terms: Cross-lingual posterior features, Multi-stream features, Low resource ASR, Tandem features
