What question did this study set out to answer?

The aim is to develop a sentiment analysis model that effectively mitigates hallucinations in predictions and enhances interpretability.

January 21, 2026 | Open Access

Hallucination-Aware Interpretable Sentiment Analysis Model: A Grounded Approach to Reliable Social Media Content Classification

Key Points

Aimed to develop a sentiment analysis model that mitigates hallucinated predictions and improves interpretability.
Incorporated semantic grounding and interpretability-congruent supervision into the model.
Utilized a fine-tuned Open Pre-trained Transformer (OPT) as the base architecture.
Implemented a Sentiment Integrity Filter (SIF) and SHAP-guided regularization technique.
Conducted experiments on two multi-class sentiment datasets from Twitter and Reddit.
Achieved an average accuracy of 97.6% and a hallucination rate of 2.3% on the first dataset.
Demonstrated strong generalization on the second dataset with an accuracy of 95.8% and a hallucination rate of 3.4%.
Outperformed current transformer-based and hybrid sentiment models, with hallucination mitigation causing no performance degradation.

Abstract

Sentiment analysis (SA) has become an essential tool for analyzing social media content in order to monitor public opinion and support digital analytics. Although transformer-based SA models exhibit remarkable performance, they lack mechanisms to mitigate hallucinated sentiment, that is, unsupported or overconfident predictions made without explicit linguistic evidence. To address this limitation, this study presents a hallucination-aware SA model that incorporates semantic grounding, interpretability-congruent supervision, and neuro-symbolic reasoning within a unified architecture. The proposed model builds on a fine-tuned Open Pre-trained Transformer (OPT) and adds three core mechanisms: a Sentiment Integrity Filter (SIF), a SHapley Additive exPlanations (SHAP)-guided regularization technique, and a confidence-based lexicon-deep fusion module. Experiments were conducted on two multi-class sentiment datasets containing Twitter (now X) and Reddit posts. On Dataset 1, the proposed model achieved an average accuracy of 97.6% and a hallucination rate of 2.3%, outperforming current transformer-based and hybrid sentiment models. On Dataset 2, the framework demonstrated strong external generalization with an accuracy of 95.8% and a hallucination rate of 3.4%, which is significantly lower than that of state-of-the-art methods. These findings indicate that hallucination mitigation can be integrated into transformer optimization without any performance degradation, offering a deployable, interpretable, and linguistically complex social media SA framework, which will enhance the reliability of neural language understanding systems.
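To make the notion of hallucinated sentiment concrete, the following is a minimal sketch of how a hallucination rate could be computed, assuming a prediction counts as hallucinated when it is confident and polar but has no agreeing lexical evidence in the text. This is not the paper's Sentiment Integrity Filter or its evaluation protocol. The toy lexicon, the `Prediction` type, the function names, and the 0.9 confidence threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; the lexicon used in the paper is not given in the abstract.
TOY_LEXICON = {
    "love": 1.0, "great": 1.0, "good": 0.5,
    "hate": -1.0, "awful": -1.0, "bad": -0.5,
}


@dataclass
class Prediction:
    text: str
    label: str         # "positive", "negative", or "neutral"
    confidence: float  # model probability assigned to the predicted label


def lexicon_evidence(text: str) -> float:
    """Sum toy lexicon scores over lowercased, punctuation-stripped tokens."""
    tokens = (tok.strip(".,!?") for tok in text.lower().split())
    return sum(TOY_LEXICON.get(tok, 0.0) for tok in tokens)


def is_unsupported(pred: Prediction, min_confidence: float = 0.9) -> bool:
    """Flag a confident polar prediction with no agreeing lexical evidence.

    Assumed definition, loosely following the abstract's wording
    ("unsupported or overconfident predictions without explicit
    linguistic evidence"); the threshold is an illustrative choice.
    """
    if pred.label == "neutral" or pred.confidence < min_confidence:
        return False
    score = lexicon_evidence(pred.text)
    if pred.label == "positive":
        return score <= 0.0
    return score >= 0.0


def hallucination_rate(preds: list[Prediction]) -> float:
    """Fraction of predictions flagged as unsupported (0.0 for an empty list)."""
    if not preds:
        return 0.0
    return sum(is_unsupported(p) for p in preds) / len(preds)


examples = [
    Prediction("I love this phone!", "positive", 0.97),
    Prediction("The bus arrives at noon.", "negative", 0.95),
    Prediction("Awful service", "negative", 0.99),
    Prediction("ok I guess", "positive", 0.60),
]
rate = hallucination_rate(examples)
```

In this toy run only the second example is flagged, since it is labeled negative with high confidence although the text contains no sentiment-bearing word. A real system would need a far richer evidence signal than a word list, for example attribution scores such as SHAP values.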

Sait et al. studied this question.

synapsesocial.com/papers/69706c87b6488063ad5c1a7b https://doi.org/10.3390/electronics15020409
