Verbmobil, a German research project, aims at machine translation of spontaneous speech. Its ultimate goal is a portable machine translator that allows people to negotiate in their native language. Within this project, the University of Karlsruhe has developed a speech recognition engine that has been evaluated yearly during the project and achieves very promising word accuracy on large-vocabulary spontaneous speech. We introduce the Janus Speech Recognition Toolkit underlying the speech recognizer. We discuss and evaluate the main new contributions to the acoustic modeling of our 1996 evaluation system: speaker normalization, channel normalization, and polyphonic clustering. Besides the acoustic models, we describe the language models used in our evaluation system: word trigram models interpolated with class-based models, and a separate spelling language model. As a result of using the toolkit and integrating all these parts into the recognition engine, the word error rate on the German Spontaneous Scheduling Task (GSST) decreased from 30% in 1995 to 13.8% in 1996.
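To put the reported figures in context, the following is a minimal sketch of how word error rate is conventionally computed (word-level edit distance divided by reference length) and of the relative reduction implied by going from 30% to 13.8%. The function names and the example sentences are illustrative assumptions; the abstract does not describe the scoring tool actually used in the evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: the conventional WER definition (substitutions, deletions
    and insertions all cost 1); not taken from the paper itself.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old: float, new: float) -> float:
    """Fraction by which an error rate dropped relative to its old value."""
    return (old - new) / old


if __name__ == "__main__":
    # Hypothetical scheduling-style utterance with one dropped word.
    print(word_error_rate("wir treffen uns am montag", "wir treffen uns montag"))
    # Error rates reported in the abstract: 30% (1995) and 13.8% (1996).
    print(relative_reduction(30.0, 13.8))
```

Under this definition, the drop from 30% to 13.8% corresponds to a relative word error rate reduction of 54%.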
Finke et al. studied this question.