December 1, 2011

Strategies for training large scale neural network language models

Abstract

We describe how to effectively train neural network based language models on large data sets. Faster convergence during training and better overall performance are observed when the training data are sorted by their relevance. We introduce a hash-based implementation of a maximum entropy model that can be trained as part of the neural network model. This leads to a significant reduction in computational complexity. We achieved around a 10% relative reduction in word error rate on an English Broadcast News speech recognition task, against a large 4-gram model trained on 400M tokens.
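
To make the hash-based maximum entropy idea concrete, here is a minimal sketch, not the authors' implementation. It assumes words are integer ids and that each n-gram feature (context plus candidate word) is hashed into one slot of a single fixed-size weight array, so no explicit feature dictionary is stored and colliding features simply share a weight. The names TABLE_SIZE, feature_index, active_features, predict and sgd_step, the hash function, and all constants are hypothetical choices for illustration. In a joint model, the hashed scores would presumably be added to the network's output activations before the softmax; this sketch covers only the maximum entropy part.

```python
import math

TABLE_SIZE = 2 ** 16  # hypothetical size of the shared weight table


def feature_index(context, word, table_size=TABLE_SIZE):
    """Hash an (n-gram context, candidate word) pair to a slot in the weight table."""
    h = len(context) + 1  # seed with the n-gram order
    for token_id in (*context, word):
        h = (h * 1000003 + token_id + 1) % table_size
    return h


def active_features(history, word, max_order=3):
    """Slots of the n-gram features (orders 1..max_order) that fire for `word`."""
    indices = []
    for order in range(1, max_order + 1):
        if order - 1 > len(history):
            break  # not enough history for this order
        context = tuple(history[len(history) - (order - 1):])
        indices.append(feature_index(context, word))
    return indices


def predict(weights, history, vocab_size, max_order=3):
    """Softmax over the vocabulary using only the hashed n-gram weights."""
    scores = [
        sum(weights[i] for i in active_features(history, w, max_order))
        for w in range(vocab_size)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(weights, history, target, vocab_size, lr=0.1, max_order=3):
    """One stochastic gradient step on the log-likelihood; returns the loss before the update."""
    probs = predict(weights, history, vocab_size, max_order)
    for w in range(vocab_size):
        grad = (1.0 if w == target else 0.0) - probs[w]
        for i in active_features(history, w, max_order):
            weights[i] += lr * grad
    return -math.log(probs[target])


weights = [0.0] * TABLE_SIZE
losses = [sgd_step(weights, [3, 1], 2, vocab_size=5) for _ in range(20)]
```

Memory use is fixed by the table size rather than by the number of distinct n-grams, which is the main appeal of hashing here; the cost is that colliding features cannot be told apart.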

Cite This Study

Mikolov et al. (2011) studied this question.

synapsesocial.com/papers/69db1d944a1e15904c836ef5 https://doi.org/10.1109/asru.2011.6163930
