Video-based emotion recognition using CNN-RNN and C3D hybrid networks

Abstract

In this paper, we present a video-based emotion recognition system submitted to the EmotiW 2016 Challenge. The core module of this system is a hybrid network that combines a recurrent neural network (RNN) and a 3D convolutional network (C3D) in a late-fusion fashion. The RNN and C3D encode appearance and motion information in different ways. Specifically, the RNN takes as input appearance features extracted by a convolutional neural network (CNN) from individual video frames and encodes motion afterwards, while C3D models the appearance and motion of the video simultaneously. Combined with an audio module, our system achieved a recognition accuracy of 59.02% without using any additional emotion-labeled video clips in the training set, compared with the 53.8% achieved by the winner of EmotiW 2015. Extensive experiments show that combining the RNN and C3D noticeably improves video-based emotion recognition.
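
To make the late-fusion idea concrete, here is a minimal sketch of how per-model class probabilities might be combined. The abstract does not state the fusion rule, so the weighted average, the function name `late_fusion`, the weights, the seven-class label list, and all probability values below are illustrative assumptions, not the authors' implementation.

```python
# Assumed label set for illustration; the abstract does not list the classes.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprise"]


def late_fusion(scores, weights):
    """Weighted average of per-model class probabilities (hypothetical rule).

    scores:  dict mapping model name -> list of class probabilities
    weights: dict mapping model name -> non-negative fusion weight
    """
    total = sum(weights[name] for name in scores)
    n_classes = len(next(iter(scores.values())))
    fused = [0.0] * n_classes
    for name, probs in scores.items():
        w = weights[name] / total  # normalise so fused scores still sum to 1
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused


# Made-up outputs for one clip from each module (each vector sums to 1).
rnn_probs = [0.10, 0.05, 0.05, 0.50, 0.15, 0.10, 0.05]    # CNN-RNN branch
c3d_probs = [0.05, 0.05, 0.10, 0.40, 0.25, 0.10, 0.05]    # C3D branch
audio_probs = [0.10, 0.10, 0.10, 0.20, 0.30, 0.10, 0.10]  # audio module

fused = late_fusion(
    {"rnn": rnn_probs, "c3d": c3d_probs, "audio": audio_probs},
    {"rnn": 0.4, "c3d": 0.4, "audio": 0.2},  # hypothetical fusion weights
)
predicted = EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]
print(predicted, [round(p, 3) for p in fused])
```

Because each module is trained and scored independently, a scheme like this lets the fusion weights be tuned on a validation set without retraining any network.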

Authors

Fan Yin

University of South China

Xiangju Lu

iQIYI (China)

Dian Li

Changjiang Institute of Survey, Planning, Design and Research

Institutions

iQIYI (China)

Cite this study

Yin et al. studied this question.

synapsesocial.com/papers/6a0f3f99f7e1df59726c9c8e — DOI: https://doi.org/10.1145/2993148.2997632
