Understanding how consumers behave when shopping online is critical for businesses adapting to evolving market trends. Customer reviews are a rich source of information about consumer sentiments and preferences, and sentiment analysis of these reviews has become a powerful tool for uncovering consumer emotions and purchasing trends. However, traditional methods that rely on shallow lexical features and classical machine learning algorithms often fail to capture the intricate, context-dependent patterns in textual data. In this study, we propose using the large language model RoBERTa-Large to improve sentiment classification by leveraging its contextual embeddings and attention mechanisms, which capture complex semantic relationships beyond surface-level word frequencies. Alongside sentiment analysis, we apply topic modeling with Latent Dirichlet Allocation (LDA) to publicly available datasets to identify prevalent themes in consumer feedback. We comprehensively compare RoBERTa-Large against traditional machine learning and ensemble models using TF-IDF features, as well as deep learning architectures utilizing sentence embeddings and transformer-based models. Experimental results show that RoBERTa-Large achieves the highest accuracy, 93.59%, significantly outperforming the baseline models. To enhance model transparency and trustworthiness, we apply the SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) interpretability techniques, which provide meaningful explanations of model predictions at both global and local levels.
Qi Shasha studied this question.
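To make concrete what "surface-level word frequencies" means for the TF-IDF baselines mentioned above, the following is a minimal sketch of TF-IDF weighting in plain Python. It is not the study's implementation: the abstract does not specify the weighting variant, so the smoothed inverse document frequency, the L2 normalization, the `tokenize` and `tfidf` helpers, and the three toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Hypothetical simplification: split on whitespace, strip punctuation, lowercase.
    tokens = (tok.strip(PUNCT).lower() for tok in text.split())
    return [tok for tok in tokens if tok]

def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    # Assumed smoothed IDF variant: ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        weights = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors

# Toy reviews, invented for illustration only.
reviews = [
    "Great phone, great battery life",
    "Battery died fast, not great",
    "Fast delivery and great price",
]
vectors = tfidf(reviews)
```

In this toy corpus "great" occurs in every review, so it receives the lowest IDF, while "not" is weighted more heavily in the second review. The representation still scores "great" and "not" independently, so the negation in "not great" is not modeled; this is the kind of contextual pattern that the abstract argues transformer embeddings capture.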