November 19, 2002

Phrase bigrams for continuous speech recognition

Abstract

In some speech recognition tasks, such as man-machine dialogue systems, the spoken sentences include several recurrent phrases. A bigram language model does not adequately represent these phrases because it underestimates their probability. A better approach is to model phrases as if they were individual dictionary elements: they are inserted as additional entries into the word lexicon, and bigrams are then computed on the extended lexicon. This paper discusses two procedures for automatically determining frequent phrases (within the framework of a probabilistic language model) in an unlabeled training set of written sentences. One procedure is optimal in the sense that it minimises the perplexity of the training set. The other, based on information-theoretic criteria, ensures that the resulting model has high statistical robustness. The two procedures are tested on a 762-word spontaneous speech recognition task. They give similar results and provide a moderate improvement over standard bigrams.
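The idea of treating a phrase as a single lexicon entry and picking phrases by perplexity can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes an unsmoothed maximum-likelihood bigram model, restricts phrases to two-word pairs, performs a single greedy selection step, and normalises perplexity by the original word count so that different segmentations are comparable. The names `merge_pair`, `bigram_log_prob`, `perplexity`, `best_phrase`, the `min_count` threshold, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def merge_pair(sentence, pair):
    """Rewrite a token list so each occurrence of `pair` becomes one token."""
    out, i = [], 0
    while i < len(sentence):
        if i + 1 < len(sentence) and (sentence[i], sentence[i + 1]) == pair:
            out.append(sentence[i] + "_" + sentence[i + 1])
            i += 2
        else:
            out.append(sentence[i])
            i += 1
    return out


def bigram_log_prob(sentences):
    """Log-probability of the sentences under their own maximum-likelihood
    bigram model (no smoothing: an assumption of this sketch)."""
    history, bigrams = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s + ["</s>"]
        history.update(toks[:-1])            # counts of each conditioning token
        bigrams.update(zip(toks, toks[1:]))  # counts of each adjacent pair
    return sum(c * math.log(c / history[h]) for (h, _), c in bigrams.items())


def perplexity(sentences, n_words):
    """Perplexity per original word; n_words stays fixed across segmentations."""
    return math.exp(-bigram_log_prob(sentences) / n_words)


def best_phrase(sentences, min_count=2):
    """Return (pair, baseline perplexity, perplexity after merging) for the
    candidate pair that lowers training-set perplexity the most."""
    n_words = sum(len(s) + 1 for s in sentences)  # words plus end-of-sentence
    base = perplexity(sentences, n_words)
    candidates = Counter(p for s in sentences for p in zip(s, s[1:]))
    best, best_pp = None, base
    for pair, count in candidates.items():
        if count < min_count:
            continue  # skip rare pairs, which cannot be estimated reliably
        merged = [merge_pair(s, pair) for s in sentences]
        pp = perplexity(merged, n_words)
        if pp < best_pp:
            best, best_pp = pair, pp
    return best, base, best_pp


# Toy corpus (invented for illustration only).
corpus = [
    "i would like to fly to rome".split(),
    "i would like to go to milan".split(),
    "we would like a ticket".split(),
    "i want to fly to milan".split(),
]
pair, before, after = best_phrase(corpus)
print(pair, round(before, 3), round(after, 3))
```

Because the model in this sketch is unsmoothed and scored on its own training data, merging a pair tends to look favourable, which is why the `min_count` guard is included. A practical system would need smoothing or held-out data, and would repeat the selection step to build longer phrases.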

Cite This Study

E. Giachin studied this question.

synapsesocial.com/papers/6a1b4ab539ea7417dc42ab8a https://doi.org/10.1109/icassp.1995.479405
