Expressive body motion is an important but difficult-to-interpret channel for emotion communication in embodied agents. Using musical performance as a naturalistic multimodal context, we examined how audio and body motion contribute to emotion perception. In a controlled perceptual study, the audiovisual percept was more similar to the audio-only percept than to the visual-only percept, consistent with an audio-dominant pattern of integration, while body motion alone yielded weak and ambiguous judgments, particularly for valence. We found no evidence that audiovisual presentation systematically shifted mean valence or arousal, and the overall emotional-intensity effect was sensitive to statistical assumptions. The clearest robust pattern was that valence judgments were least decisive in the visual-only condition, while exploratory clip-level analyses suggested a possible facilitatory effect of body motion for low-arousal clips. To support this analysis, we present EMOSIC, a synchronized dataset of over 7.5 h of monophonic audio and upper-body motion capture from professional performances on traditional Chinese plucked instruments, which provides a non-Eurocentric resource for future research on multimodal affect perception and emotionally expressive behavior in embodied agents.
Ma et al. studied this question.