This paper presents a parametrization of emotional speech using a pool of features commonly used in emotion recognition, such as fundamental frequency, formants, energy, and MFCC, PLP, and LPC coefficients. The pool is expanded with perceptual coefficients such as BFCC, HFCC, RPLP, and RASTA PLP, which are used in speech recognition but have not been applied to emotion detection. The main contribution of this work is a comparison of emotion detection accuracy for each feature type, based on results from both k-NN and SVM classifiers with 10-fold cross-validation. The analysis was performed on two different Polish emotional speech databases: voice performances by professional actors, compared with the author’s spontaneous speech.
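The per-feature comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only a hand-written k-NN classifier (the SVM is omitted for brevity), and the feature groups are synthetic stand-ins rather than real MFCC or PLP vectors. The emotion labels, the group names `informative` and `noise`, and all function names are hypothetical and chosen for illustration only.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(X, y, n_folds=10, k=5, seed=0):
    """Mean k-NN accuracy over n_folds folds built from a shuffled index list."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / n_folds


def make_toy_data(n_per_class=40, seed=1):
    """Synthetic data (an assumption): one separable feature group, one pure-noise group."""
    rng = random.Random(seed)
    emotions = ["anger", "joy", "sadness"]  # placeholder labels, not taken from the paper
    groups = {"informative": [], "noise": []}
    labels = []
    for c, emotion in enumerate(emotions):
        for _ in range(n_per_class):
            # Class-dependent mean: this group carries information about the label.
            groups["informative"].append([rng.gauss(3.0 * c, 0.5) for _ in range(4)])
            # Same distribution for every class: this group carries no information.
            groups["noise"].append([rng.gauss(0.0, 1.0) for _ in range(4)])
            labels.append(emotion)
    return groups, labels


def compare_feature_groups(groups, labels):
    """Run the same 10-fold evaluation on each feature group separately."""
    return {name: cross_val_accuracy(X, labels) for name, X in groups.items()}


groups, labels = make_toy_data()
results = compare_feature_groups(groups, labels)
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Evaluating each feature group with the same folds and the same classifier settings keeps the accuracies directly comparable. In a real experiment, each group would hold one feature type extracted from the recordings, and a second classifier such as an SVM would be evaluated in the same loop.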
Kamińska et al. (Mon,) studied this question.