In this paper, we present a video-based emotion recognition system submitted to the EmotiW 2016 Challenge. The core module of this system is a hybrid network that combines a recurrent neural network (RNN) and a 3D convolutional network (C3D) in a late-fusion fashion. The RNN and C3D encode appearance and motion information in different ways. Specifically, the RNN takes appearance features extracted by a convolutional neural network (CNN) from individual video frames as input and encodes motion afterwards, while C3D models the appearance and motion of the video simultaneously. Combined with an audio module, our system achieved a recognition accuracy of 59.02% without using any additional emotion-labeled video clips in the training set, compared with the 53.8% achieved by the winner of EmotiW 2015. Extensive experiments show that combining RNN and C3D noticeably improves video-based emotion recognition.
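The late-fusion step described above can be illustrated with a minimal sketch. It assumes each module (CNN-RNN, C3D, audio) outputs a probability vector over the emotion classes and that the vectors are combined by a weighted average. The abstract does not state the fusion rule, so the weighted average, the function name `late_fusion`, the seven class labels, the weights, and all scores below are hypothetical and chosen only for illustration.

```python
# Hypothetical class labels; the abstract does not list them.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprise"]


def late_fusion(scores, weights):
    """Weighted average of per-module class-probability vectors.

    scores:  dict mapping module name -> list of class probabilities
    weights: dict mapping module name -> non-negative fusion weight
    Weights are normalized to sum to 1, so the fused vector is still
    a probability distribution when every input vector is one.
    """
    total = sum(weights[name] for name in scores)
    n_classes = len(next(iter(scores.values())))
    fused = [0.0] * n_classes
    for name, probs in scores.items():
        w = weights[name] / total
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused


# Made-up per-module outputs for one video clip (not results from the paper).
scores = {
    "rnn":   [0.10, 0.05, 0.05, 0.50, 0.20, 0.05, 0.05],
    "c3d":   [0.05, 0.05, 0.10, 0.30, 0.40, 0.05, 0.05],
    "audio": [0.10, 0.10, 0.10, 0.30, 0.20, 0.10, 0.10],
}
# Assumed fusion weights; in practice these would be tuned on validation data.
weights = {"rnn": 0.4, "c3d": 0.4, "audio": 0.2}

fused = late_fusion(scores, weights)
predicted = EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]
print(predicted)
```

In this toy example the C3D branch alone would pick "Neutral", but the fused scores favor "Happy", which shows how late fusion lets modules with different views of the clip correct one another.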
Fan Yin
University of South China
Xiangju Lu
iQIYI (China)
Dian Li
Changjiang Institute of Survey, Planning, Design and Research
iQIYI (China)
Yin et al. studied this question.
synapsesocial.com/papers/6a0f3f99f7e1df59726c9c8e — DOI: https://doi.org/10.1145/2993148.2997632