In this study, we investigate acoustic properties of speech associated with four different emotions (sadness, anger, happiness, and neutral) intentionally expressed in speech by an actress. The aim is to obtain detailed acoustic knowledge on how speech is modulated when a speaker’s emotion changes from neutral to a certain emotional state. The study is based on measurements of acoustic parameters related to speech prosody, vowel articulation, and spectral energy distribution. Acoustic similarities and differences among the emotions are then explored with mutual information computation, multidimensional scaling, and comparison of acoustic likelihoods relative to the neutral emotion. In addition, acoustic separability of the emotions is tested using discriminant analysis at the utterance level, and the result is compared with human evaluation. Results show that happiness/anger and neutral/sadness share similar acoustic properties in this speaker. Speech associated with anger and happiness is characterized by longer utterance duration, shorter inter-word silence, and higher pitch and energy values with wider ranges, showing the characteristics of exaggerated or hyperarticulated speech. The discriminant analysis indicates that within-group acoustic separability is relatively poor, suggesting that the conventional acoustic parameters examined in this study are not effective in describing the emotions along the valence (or pleasure) dimension. It is noted that RMS energy, inter-word silence, and speaking rate are useful in distinguishing sadness from the other emotions. Interestingly, the between-group difference in formant patterns seems better reflected in back vowels such as /a/ (/father/) than in the front vowels. Larger lip opening and/or more tongue constriction at the mid or rear part of the vocal tract could be underlying reasons.
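As a rough illustration of the mutual information computation mentioned above, the following Python sketch estimates the mutual information between a discretized acoustic feature and the emotion label. It is only a minimal sketch under stated assumptions: the pitch values, the 220 Hz threshold, the two-bin discretization, and the helper names `mutual_information` and `discretize` are all hypothetical and are not taken from the study, which does not specify here how features were binned or how mutual information was estimated.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete symbols."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    n = len(xs)
    if n == 0:
        raise ValueError("sequences must be non-empty")
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def discretize(values, threshold):
    """Hypothetical two-bin discretization: 'high' at or above threshold, else 'low'."""
    return ["high" if v >= threshold else "low" for v in values]


# Made-up toy data for illustration only; NOT measurements from the study.
emotions = ["neutral", "sadness", "anger", "happiness"] * 2
pitch_hz = [190, 185, 260, 255, 195, 180, 270, 250]

pitch_bins = discretize(pitch_hz, threshold=220)  # threshold is an assumption
mi_bits = mutual_information(pitch_bins, emotions)
print(f"I(pitch bin; emotion) = {mi_bits:.3f} bits")
```

In this toy setup the pitch bin separates the anger/happiness pair from the neutral/sadness pair but carries no information within each pair, which mirrors the kind of grouping the abstract describes. A real analysis would need a principled binning or a continuous estimator, and far more data.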
Yıldırım et al. studied this question.