Recently there has been growing interest in using neural networks for language modeling. In contrast to the well-known backoff n-gram language models (LMs), the neural network approach tries to limit the problems caused by data sparseness by performing the estimation in a continuous space, which allows smooth interpolation. This type of LM is therefore interesting for tasks for which only a very limited amount of in-domain training data is available, such as the modeling of conversational speech. In this paper we analyze the generalization behavior of the neural network LM for in-domain training corpora varying from 7M to over 21M words. In all cases, significant word error reductions were observed compared to a carefully tuned 4-gram backoff language model in a state-of-the-art conversational speech recognizer for the NIST rich transcription evaluations. We also apply ensemble learning methods and discuss their connections with LM interpolation.
Schwenk et al. studied this question.