Multimodal emotion recognition extracts emotional information from sequential multimodal data and classifies emotional tendencies. Current multimodal fusion methods mainly rely on Transformers to extract features and integrate different data types. Although Transformers are strong at learning global information, their quadratic complexity in sequence length limits efficiency. Recent advances in state space models, especially the Mamba architecture, offer a promising alternative by achieving global awareness with linear complexity. However, the potential of Mamba for information fusion in multimodal domains remains largely untapped. This paper introduces an efficient multimodal fusion and contrastive learning method, called Fusion Mamba and Contrastive Learning, for emotion recognition tasks. To extract distinct features, a unimodal Mamba architecture is used to enhance unimodal representations. For comprehensive information fusion, the Mamba block is extended to handle dual inputs, forming a novel module called the Fusion Mamba block. This block is the basis for an architecture that incorporates three modalities and three branches. Additionally, contrastive learning and interaction-level auxiliary classification constraints are jointly optimized to boost performance. The approach is validated through experiments on three public datasets. Both quantitative and qualitative evaluations show that the method achieves state-of-the-art performance with 32.2% faster inference. Extensive ablation studies further confirm the effectiveness of the Mamba architecture in multimodal tasks.
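The linear-complexity claim for state space models can be illustrated with a minimal sketch. The code below is not the paper's Fusion Mamba block and not Mamba itself (Mamba uses input-dependent, multi-channel parameters). It is a simplified scalar recurrence, with the hypothetical function name `ssm_scan` and illustrative parameters `a`, `b`, `c` chosen here as assumptions, showing how each output depends on all earlier inputs while the sequence is processed in a single pass.

```python
def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Scalar linear state-space recurrence (illustrative, not Mamba).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    One loop over the sequence, so the cost grows linearly with its length,
    unlike pairwise self-attention, which compares every pair of positions.
    """
    h = 0.0          # hidden state summarizing all inputs seen so far
    ys = []
    for x in xs:
        h = a * h + b * x   # update the state with the current input
        ys.append(c * h)    # read out the output from the state
    return ys


# An impulse at the first step decays through the state at later steps.
print(ssm_scan([1.0, 0.0, 0.0]))
```

Because the hidden state carries a summary of the whole prefix, each output has access to global context without revisiting earlier positions.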
Qianjun Shuai
Xiaohao Chen
Feng Hu
Complex & Intelligent Systems
University of Sunderland
Communication University of China
Shuai et al. studied this question.
www.synapsesocial.com/papers/69fbef68164b5133a91a338e — DOI: https://doi.org/10.1007/s40747-026-02318-z