Sentiment analysis (SA) has become an essential tool for analyzing social media content in order to monitor public opinion and support digital analytics. Although transformer-based SA models exhibit remarkable performance, they lack mechanisms to mitigate hallucinated sentiment, which refers to the generation of unsupported or overconfident predictions without explicit linguistic evidence. To address this limitation, this study presents a hallucination-aware SA model by incorporating semantic grounding, interpretability-congruent supervision, and neuro-symbolic reasoning within a unified architecture. The proposed model is based on a fine-tuned Open Pre-trained Transformer (OPT) model, using three fundamental mechanisms: a Sentiment Integrity Filter (SIF), a SHapley Additive exPlanations (SHAP)-guided regularization technique, and a confidence-based lexicon-deep fusion module. The experimental analysis was conducted on two multi-class sentiment datasets that contain Twitter (now X) and Reddit posts. In Dataset 1, the suggested model achieved an average accuracy of 97.6% and a hallucination rate of 2.3%, outperforming the current transformer-based and hybrid sentiment models. With Dataset 2, the framework demonstrated strong external generalization with an accuracy of 95.8%, and a hallucination rate of 3.4%, which is significantly lower than state-of-the-art methods. These findings indicate that it is possible to include hallucination mitigation into transformer optimization without any performance degradation, offering a deployable, interpretable, and linguistically complex social media SA framework, which will enhance the reliability of neural systems of language understanding.
Sait et al. (Fri,) studied this question.