March 24, 2005

Speech recognition using an auditory model with pitch-synchronous analysis

Key Points

Key points are not available for this paper at this time.

Abstract

An auditory model with two-tone suppression has previously been shown to perform better in speech recognition experiments than a conventional filterbank representation, particularly with noisy or distorted speech. It was, however, known to have several defects including an uneven response across the spectrum and a tendency to detect harmonics of F 0 rather than F 1 . We show that instants of glottal excitation can be derived from the model even with noisy speech. By using this information to carry out pitch-synchronous analysis in a slightly modified model the problem of interaction with harmonics of F 0 can be solved. An analysis of the behavior of the model leads to a specification of a class of processes showing two-tone suppression and hence to a redesigned model avoiding the known defects. The pitch-synchronous analysis is then no longer necessary, but the robust indication of excitation points may have other uses. Spectrograms from the old and new models illustrate the improvements obtained.

Mark Helpful

Bookmark

Relay