September 5, 2025

Self-supervised Guided Modality Disentangled Representation Learning for Multimodal Sentiment Analysis and Schizophrenia Assessment

Key Points

The model uses self-supervised learning to guide modality-specific representation learning for multimodal sentiment analysis.
A text-centric fusion combines the disentangled representations while mitigating noise and redundant information, and the method reports state-of-the-art results on the benchmark datasets.
Evaluation covers three publicly available multimodal sentiment analysis benchmarks and a privately collected dataset on schizophrenia counseling.
The results suggest that disentangled representation learning could advance mental health assessment in real-world settings.

Abstract

As the impact of chronic mental disorders increases, multimodal sentiment analysis (MSA) has emerged to improve diagnosis and treatment. In this paper, we use disentangled representation learning to address modality heterogeneity, with self-supervised learning as guidance. The self-supervised component generates pseudo unimodal labels that guide modality-specific representation learning, preventing the acquisition of meaningless features. We also propose a text-centric fusion that mitigates the impact of noise and redundant information and fuses the disentangled representations into a comprehensive multimodal representation. We evaluate our model on three publicly available benchmark datasets for multimodal sentiment analysis and on a privately collected dataset focusing on schizophrenia counseling. The experimental results demonstrate state-of-the-art performance across various metrics on the benchmark datasets, surpassing related work. Furthermore, our learning algorithm shows promising performance in real-world applications, outperforming our previous work and achieving significant progress in schizophrenia assessment.
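
The abstract does not say how the text-centric fusion is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each modality has already been encoded as a fixed-length vector, treats the text vector as the anchor, and weights the non-text vectors by a softmax over their cosine similarity to the text. The function names (`cosine`, `softmax`, `text_centric_fusion`) and the similarity-based weighting are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def text_centric_fusion(text, others):
    """Hypothetical fusion rule (an assumption, not taken from the paper).

    text:   list of floats, the text representation used as the anchor
    others: dict mapping a modality name to a vector of the same length

    Each non-text vector is weighted by the softmax of its cosine similarity
    to the text vector, and the weighted sum is added to the text vector.
    Returns the fused vector and the per-modality weights.
    """
    names = sorted(others)
    weights = softmax([cosine(text, others[n]) for n in names])
    fused = list(text)
    for w, n in zip(weights, names):
        for i, x in enumerate(others[n]):
            fused[i] += w * x
    return fused, dict(zip(names, weights))


# Toy vectors: the audio vector agrees with the text, the visual one is orthogonal.
fused, weights = text_centric_fusion(
    [1.0, 0.0],
    {"audio": [1.0, 0.0], "visual": [0.0, 1.0]},
)
```

In this toy case the audio vector receives the larger weight because it is aligned with the text anchor, which is the intended effect of a text-centric design: modalities that disagree with the text contribute less to the fused representation.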
