Multi-modal emotion recognition has attracted increasing attention in human-computer interaction, as it extracts complementary information from physiological and behavioral features. Compared to single-modal approaches, multi-modal fusion methods are more susceptible to uncertainty in emotion recognition, such as heterogeneity and inconsistent predictions across different modalities. Previous multi-modal approaches neither model uncertainty in fusion systematically nor reveal the dynamic variations of the emotion process. In this paper, we propose a dynamic confidence-aware fusion network for robust recognition of heterogeneous emotion features, including electroencephalogram (EEG) and facial expression. First, we develop a self-attention-based multi-channel LSTM network to preliminarily align the heterogeneous emotion features. Second, we propose a confidence regression network to estimate the true class probability (TCP) of each modality, which helps explore uncertainty at the modality level. Then, the modalities are fused with weights determined by the above two types of uncertainty. Finally, we adopt a self-paced learning (SPL) mechanism to further improve model robustness by alleviating the negative effect of hard-to-learn samples. Experimental results on several multi-modal emotion datasets demonstrate that the proposed method outperforms state-of-the-art methods in emotion recognition performance and explicitly reveals the dynamic variation of emotion through uncertainty estimation. Our code is available at:
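The sketch below is not the authors' released code. It is a minimal, hypothetical illustration of two ideas named in the abstract: weighting each modality's class distribution by an estimated confidence before fusing, and hard self-paced weighting that drops high-loss samples. The function names, the example probabilities, and the confidence values are all assumptions made for illustration; in the paper the confidences would come from a trained confidence regression network, and the fusion also uses a second type of uncertainty that is not modeled here.

```python
def true_class_probability(probs, label):
    """TCP: the probability a classifier assigns to the ground-truth class.

    Usable as a regression target during training, when labels are known.
    """
    return probs[label]


def confidence_weighted_fusion(modal_probs, confidences):
    """Fuse per-modality class distributions using normalized confidences.

    modal_probs: list of probability lists, one per modality (same length).
    confidences: one estimated confidence per modality (assumed here to be
                 given; a stand-in for a confidence regression output).
    """
    total = sum(confidences)
    if total <= 0:
        # Fall back to uniform weights if no modality is trusted.
        weights = [1.0 / len(confidences)] * len(confidences)
    else:
        weights = [c / total for c in confidences]
    n_classes = len(modal_probs[0])
    fused = [
        sum(w * p[k] for w, p in zip(weights, modal_probs))
        for k in range(n_classes)
    ]
    return fused, weights


def self_paced_weights(losses, threshold):
    """Hard self-paced weighting: keep samples with loss below the threshold."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]


# Hypothetical three-class example with two modalities.
eeg_probs = [0.7, 0.2, 0.1]
face_probs = [0.3, 0.4, 0.3]
fused, weights = confidence_weighted_fusion([eeg_probs, face_probs], [0.9, 0.3])
predicted_class = max(range(len(fused)), key=lambda k: fused[k])
sample_mask = self_paced_weights([0.2, 1.5, 0.8], threshold=1.0)
```

In this toy setup the more confident modality dominates the fused distribution, and raising `threshold` over training epochs would gradually admit harder samples, which is the usual self-paced schedule.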
Qi Zhu
University of Shanghai for Science and Technology
Chuhang Zheng
Nanjing University of Aeronautics and Astronautics
Zheng Zhang
Hunan Institute of Science and Technology
IEEE Transactions on Affective Computing
Harbin Institute of Technology
Nanjing University of Aeronautics and Astronautics
Peng Cheng Laboratory
Zhu et al. (Fri,) studied this question.
synapsesocial.com/papers/69de969c5e582bd3c5e939b0 — DOI: https://doi.org/10.1109/taffc.2023.3340924