Abstract

Since emotional speech can be regarded as a variation on neutral (non-emotional) speech, it is expected that a robust neutral speech model can be useful in contrasting different emotions expressed in speech. This study explores this idea by creating acoustic models trained with spectral features, using the emotionally-neutral TIMIT corpus. The performance is tested with two emotional speech databases: one recorded with a microphone (acted), and another recorded from a telephone application (spontaneous). It is found that accuracy up to 78% and 65% can be achieved in the binary and category emotion discriminations, respectively. Raw Mel Filter Bank (MFB) output was found to perform better than conventional MFCC, with both broad-band and telephone-band speech. These results suggest that well-trained neutral acoustic models can be effectively used as a front-end for emotion recognition, and once trained with MFB, it may reasonably work well regardless of the channel characteristics.

Index Terms: Emotion recognition, Neutral speech, HMMs, Mel filter bank (MFB), TIMIT
Busso et al. studied this question.