We present a new, fast method for discrete-utterance recognition of telephone-bandwidth speech. The method is based on speech coding by vector quantization and minimum cross-entropy pattern classification. A separate vector quantization codebook is designed from training sequences for each word in the recognition vocabulary. An input from outside the training sequence is classified by vector-quantizing it with every codebook and selecting the codebook that achieves the lowest average distortion per speech frame. The new method requires no time normalization and uses approximately 6000 bits to represent each utterance in the recognition vocabulary. Limited preliminary testing on speaker-dependent digit recognition has demonstrated excellent performance. Detailed tests are now in progress.
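The classification rule described in the abstract (one codebook per word, pick the codebook with the lowest average distortion per frame) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes squared Euclidean distance as a stand-in for the distortion measure, which the abstract does not specify, plain k-means for codebook design, and synthetic two-dimensional "frames" in place of real speech features. All function names, the vocabulary, and the parameter values are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance (an assumed stand-in distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(frame, codebook):
    """Return (index, distortion) of the codeword closest to the frame."""
    best = min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))
    return best, sq_dist(frame, codebook[best])

def train_codebook(frames, size, iters=20, seed=0):
    """Design a codebook with plain k-means (an assumed design procedure)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, size)]
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for f in frames:
            idx, _ = nearest(f, codebook)
            cells[idx].append(f)
        for i, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

def average_distortion(frames, codebook):
    """Average per-frame distortion when quantizing frames with codebook."""
    return sum(nearest(f, codebook)[1] for f in frames) / len(frames)

def classify(frames, codebooks):
    """Pick the word whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda w: average_distortion(frames, codebooks[w]))

def synthetic_frames(center, n, rng):
    """Hypothetical stand-in for feature frames: noisy points near a center."""
    return [[c + rng.gauss(0, 0.1) for c in center] for _ in range(n)]

rng = random.Random(1)
training = {
    "one": synthetic_frames([0.0, 0.0], 40, rng) + synthetic_frames([1.0, 0.0], 40, rng),
    "two": synthetic_frames([0.0, 3.0], 40, rng) + synthetic_frames([1.0, 3.0], 40, rng),
}
codebooks = {word: train_codebook(frames, size=4) for word, frames in training.items()}

# A test utterance much shorter than the training sequences.
test_utterance = synthetic_frames([0.0, 3.0], 15, rng)
predicted = classify(test_utterance, codebooks)
print(predicted)
```

Because the score is an average over frames, the test utterance can have any number of frames without being aligned to the training data, which is the sense in which this kind of classifier needs no time normalization.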
Shore et al. studied this question.