Abstract Existing multimodal sentiment analysis relies on static fusion and fixed GRU (Gated Recurrent Unit) structures, which handle temporal misalignment, noise, and long-distance dependencies poorly; configuring the layer hierarchy by hand is also time-consuming and generalizes poorly. This paper therefore proposes cross-modal interaction with temporal adaptive attention and an adjustable gated hierarchical recurrent network, combined with Bayesian structural optimization and adversarial multi-task training, and uses visualization to improve interpretability. The text, speech, and visual features are first linearly projected, and cross-modal cross-attention is computed between them. The three attention outputs are accumulated over time and fed into an Adaptive GRU. A learnable gate-sensitivity parameter is added to the GRU, and the forget-gate and update-gate biases are adjusted dynamically using real-time noise estimation. Next, Bayesian optimization with TPE (Tree-structured Parzen Estimator), as implemented in Optuna, automatically searches for the best hierarchical structure over 2–5 layers and 128–512 hidden units, with sentiment classification performance used to select the best configuration. Finally, an adversarial multi-task framework trains multimodal sentiment classification jointly with a weighted adversarial noise-recognition loss. In experiments, the proposed method reaches an average accuracy of 82.6% in 5-fold cross-validation (static fusion: 75.0%), a weighted F1-score of 0.86 (attention-based baseline: 0.815), a ROC-AUC (Receiver Operating Characteristic - Area Under Curve) of 0.84 ± 0.02, and a specificity of 0.854. These results are significantly better than those of static fusion and single-modal methods, and support the model's efficiency and robustness in multimodal sentiment analysis tasks.
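The projection and cross-attention step described in the abstract can be illustrated with a minimal sketch. It assumes standard scaled dot-product attention, with one modality (here text) supplying the queries and another (here audio) supplying the keys and values. The function names, the toy feature values, and the projection matrices `Wq`, `Wk`, and `Wv` are hypothetical; the abstract does not specify the exact attention formulation, so this is not the paper's implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(x, W):
    """Project vector x (length d_in) with matrix W (d_in rows, d_out columns)."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) for j in range(len(W[0]))]

def cross_attention(query_seq, context_seq, Wq, Wk, Wv):
    """Each query time step attends over all context time steps.

    Assumption: plain scaled dot-product attention, single head.
    Returns the attended sequence and the attention weights.
    """
    Q = [linear(x, Wq) for x in query_seq]
    K = [linear(x, Wk) for x in context_seq]
    V = [linear(x, Wv) for x in context_seq]
    scale = math.sqrt(len(Q[0]))
    outputs, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in K]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wt * v[j] for wt, v in zip(w, V))
                        for j in range(len(V[0]))])
    return outputs, weights

# Toy inputs (hypothetical values): 3 text steps of dim 2, 4 audio steps of dim 3.
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
audio = [[0.5, 0.1, 0.0], [0.0, 0.2, 0.9], [0.3, 0.3, 0.3], [1.0, 0.0, 0.0]]
Wq = [[1.0, 0.0], [0.0, 1.0]]              # text  -> shared dim 2
Wk = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # audio -> shared dim 2
Wv = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # audio -> shared dim 2

attended, attn = cross_attention(text, audio, Wq, Wk, Wv)
```

Because the two sequences have different lengths, each text step receives a weighted summary of all audio steps, which is one way such a mechanism can tolerate temporal misalignment between modalities. In a three-modality setting the same function would be applied to each modality pairing, and the outputs combined before the recurrent layers.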
Wangyang Shi
University of the Philippines Manila
Anhui Polytechnic University
Anhui Business College
Shi et al. (Wed,) studied this question.
synapsesocial.com/papers/694033db2d562116f2907d27 — DOI: https://doi.org/10.1007/s42452-025-07901-6