Multimodal Sentiment Analysis (MSA) aims to fuse information from multiple modalities to achieve precise sentiment classification. Recently, uncertain missing modalities have emerged as a new challenge in MSA. Previous studies have attempted to address this issue by building information interactions between pairs of modalities to compensate for the missing information. However, the resulting representations struggle to accurately reconstruct true cross-modal semantics because they lack guidance from a third modality. In addition, existing approaches make little effective use of the text modality, and their model complexity is relatively high. To tackle these issues, we propose a sequential translation-based MSA model (STMSA) that incorporates two key designs. First, the text-centric bidirectional translation mechanism leverages the dominant role of the text modality in affective tasks to sequentially establish bidirectional mappings with the audio and video modalities. Through semantic guidance from text, this mechanism explores the deep connections among the three modalities, yielding cross-modal representations that align more closely with real affective distributions. Second, the low-complexity non-modal completion architecture performs distributed fitting on joint representations in a shared space using only an encoder-decoder, thereby avoiding complex missing-modality generation processes. Extensive experiments on two public datasets, CMU-MOSI and IEMOCAP, demonstrate that the proposed model outperforms 10 state-of-the-art baseline models.
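To make the "uncertain missing modalities" setting concrete, the following is a minimal sketch of how such inputs might be simulated for one sample. It is not the STMSA model and is not taken from the paper: the function name `drop_modalities`, the `missing_rate` parameter, the toy feature vectors, and the choice to zero-fill a missing modality are all assumptions made for illustration.

```python
import random

# The three modalities named in the abstract.
MODALITIES = ("text", "audio", "video")


def drop_modalities(sample, missing_rate, rng):
    """Simulate uncertain missing modalities for one sample.

    Each modality is dropped independently with probability `missing_rate`.
    At least one modality is always kept (assumption: a sample with no
    modality at all is not a valid input). Dropped modalities are
    zero-filled, which is an illustrative choice, not the paper's method.
    """
    kept = [m for m in MODALITIES if rng.random() >= missing_rate]
    if not kept:
        # Everything was dropped; restore one modality at random.
        kept = [rng.choice(MODALITIES)]
    masked = {
        m: list(sample[m]) if m in kept else [0.0] * len(sample[m])
        for m in MODALITIES
    }
    return masked, kept


# Hypothetical toy features; real features would be sequence embeddings.
sample = {
    "text": [0.2, -0.1, 0.7],
    "audio": [0.5, 0.3],
    "video": [0.9, -0.4, 0.1, 0.0],
}
rng = random.Random(0)
masked, kept = drop_modalities(sample, missing_rate=0.5, rng=rng)
print("kept modalities:", kept)
```

Because which modalities are absent varies from sample to sample, a model in this setting cannot assume a fixed missing pattern, which is the difficulty the abstract refers to.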
Yan Hai
Shanqi Lu
Zhizhong Liu
Scientific Reports
Yantai University
North China University of Water Resources and Electric Power
China Tourism Academy
Hai et al. studied this question.
www.synapsesocial.com/papers/69fed0c1b9154b0b82877df8 — DOI: https://doi.org/10.1038/s41598-026-46910-2