October 3, 1996

A probabilistic framework for feature-based speech recognition

Abstract

Most current speech recognizers use an observation space based on a temporal sequence of "frames" (e.g., Mel-cepstra). Another class of recognizer further processes these frames to produce a segment-based network, and represents each segment by fixed-dimensional "features." In such feature-based recognizers the observation space takes the form of a temporal network of feature vectors, so that a single segmentation of an utterance will use a subset of all possible feature vectors. In this work we examine a maximum a posteriori decoding strategy for feature-based recognizers and develop a normalization criterion useful for a segment-based Viterbi or A* search. We report experimental results for the task of phonetic recognition on the TIMIT corpus, where we achieved context-independent and context-dependent (using diphones) results on the core test set of 64.1% and 69.5%, respectively.
