What question did this study set out to answer?

This research aims to evaluate the effectiveness of different speech descriptors in recognizing emotions from speech signals.

February 20, 2017 | Open Access

Efficiency of chosen speech descriptors in relation to emotion recognition

Key Points

The feature pool included common emotional speech descriptors such as fundamental frequency and MFCC.
k-NN and SVM classifiers were evaluated with 10-fold cross-validation to compare accuracy.
Data came from two Polish emotional speech databases: performances by professional actors and spontaneous speech.
The accuracy of both k-NN and SVM varied with the descriptor used for emotion recognition.
The reported results indicated that MFCC features yielded the highest accuracy in emotion detection.
Perceptual coefficients improved recognition rates compared with traditional features.
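
The evaluation protocol in the points above (one classifier, 10-fold cross-validation, one accuracy score per descriptor type) can be sketched roughly as follows. This is a minimal illustration only, not the paper's implementation: the emotion labels, the `descriptor_A`/`descriptor_B` names, and the synthetic Gaussian feature vectors are all hypothetical stand-ins for real descriptors, only the k-NN branch is shown, and the folds are shuffled but not stratified.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    # Squared Euclidean distance is enough for ranking neighbours.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(x, y, n_folds=10, k=3, seed=0):
    """Mean k-NN accuracy over n_folds shuffled folds (needs len(x) >= n_folds)."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        hits = sum(knn_predict(train_x, train_y, x[i], k) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / n_folds


def make_toy_features(spread, n_per_class=30, dim=4, seed=1):
    """Hypothetical stand-in for one descriptor type: one Gaussian cluster per emotion."""
    rng = random.Random(seed)
    emotions = ["anger", "joy", "sadness"]  # assumed labels, not taken from the paper
    x, y = [], []
    for c, emotion in enumerate(emotions):
        for _ in range(n_per_class):
            x.append([rng.gauss(c * 3.0, spread) for _ in range(dim)])
            y.append(emotion)
    return x, y


# Two synthetic "descriptors": tight clusters versus heavily overlapping ones.
descriptors = {
    "descriptor_A": make_toy_features(spread=0.5),
    "descriptor_B": make_toy_features(spread=3.0),
}
results = {name: cross_val_accuracy(x, y) for name, (x, y) in descriptors.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Running the same loop once per feature type, and once per classifier, gives the kind of per-descriptor accuracy table that such a comparison rests on; with real data the toy generator would be replaced by feature vectors extracted from the recordings.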

Abstract

This paper presents a parametrization of emotional speech using a pool of features common in emotion recognition: fundamental frequency, formants, energy, and MFCC, PLP, and LPC coefficients. The pool is expanded with perceptual coefficients such as BFCC, HFCC, RPLP, and RASTA PLP, which are used in speech recognition but not applied in emotion detection. The main contribution of this work is a comparison of emotion detection accuracy for each feature type, based on results from both k-NN and SVM algorithms with 10-fold cross-validation. The analysis was performed on two Polish emotional speech databases: voice performances by professional actors and the author’s spontaneous speech.
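
As a concrete illustration of one descriptor in the pool, fundamental frequency can be estimated from a short frame by locating the peak of its autocorrelation. The sketch below assumes this common textbook approach; the abstract does not say which extraction method the authors used, and the function name `estimate_f0`, the 60 to 400 Hz search range, and the synthetic test tone are all assumptions made for the example.

```python
import math


def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Autocorrelation-peak pitch estimate of one frame, in Hz.

    The frame must be longer than sample_rate / f_min samples.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(frame) / len(frame)
    centered = [s - mean for s in frame]
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(
            centered[i] * centered[i + lag] for i in range(len(centered) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic 200 Hz tone standing in for a voiced speech frame.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(800)]
f0 = estimate_f0(tone, sample_rate)
print(f"estimated F0: {f0:.1f} Hz")
```

On real speech, a frame-by-frame track of such estimates (plus a voicing decision, omitted here) would typically be summarized by statistics such as mean and range before being passed to a classifier.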
