Emotion recognition from speech plays an important role in developing affective and intelligent systems. This study investigates sentence-level emotion recognition. We propose a two-step approach that leverages information from sub-sentence segments to make the sentence-level decision. First, a segment-level emotion classifier generates predictions for the segments within a sentence. A second component then combines these segment predictions into a sentence-level decision. We evaluate different segment units (words, phrases, and time-based segments) and different decision combination methods (majority vote, average of probabilities, and a Gaussian Mixture Model (GMM)). Our experimental results on two different data sets show that the proposed method significantly outperforms the standard sentence-based classification approach. In addition, we find that time-based segments achieve the best performance, so our method requires no speech recognition or alignment, which is important for developing language-independent emotion recognition systems.
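As a rough illustration of the second step, the sketch below combines per-segment class probabilities into a sentence-level label using two of the combination methods named in the abstract, majority vote and average of probabilities. It is not the authors' implementation: the function names, the emotion labels, the probability values, and the tie-breaking rule are all assumptions made for illustration, and the segment-level classifier is assumed to have already produced one probability vector per segment. The GMM-based combination is left out because the abstract does not describe how it is applied.

```python
from collections import Counter


def time_segments(n_frames, seg_len):
    """Split frame indices [0, n_frames) into consecutive fixed-length windows.

    Hypothetical helper: time-based segmentation needs no words or alignment,
    only a window length (the last window may be shorter).
    """
    return [(s, min(s + seg_len, n_frames)) for s in range(0, n_frames, seg_len)]


def majority_vote(segment_probs, labels):
    """Pick the label predicted most often across segments."""
    # Hard decision per segment: index of the highest probability.
    votes = [max(range(len(p)), key=p.__getitem__) for p in segment_probs]
    counts = Counter(votes)
    best = max(counts.values())
    # Assumed tie-break: the lowest label index among the tied classes.
    winner = min(i for i, c in counts.items() if c == best)
    return labels[winner]


def average_probabilities(segment_probs, labels):
    """Pick the label with the highest mean probability across segments."""
    n = len(segment_probs)
    mean = [sum(p[i] for p in segment_probs) / n for i in range(len(labels))]
    return labels[max(range(len(mean)), key=mean.__getitem__)]


# Illustrative values only: three segments, three emotion classes.
labels = ["angry", "happy", "neutral"]
segment_probs = [
    [0.40, 0.35, 0.25],  # weakly "angry"
    [0.40, 0.35, 0.25],  # weakly "angry"
    [0.05, 0.90, 0.05],  # strongly "happy"
]

vote_label = majority_vote(segment_probs, labels)
avg_label = average_probabilities(segment_probs, labels)
windows = time_segments(10, 4)
```

With these made-up values the two rules disagree: majority vote returns "angry" because two of three segments lean that way, while averaging returns "happy" because the one confident segment outweighs the two weak ones. This kind of difference is one reason to compare combination methods rather than assume they are interchangeable.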
Jeon et al. studied this question.