Sentiment classification plays a crucial role in analyzing customer feedback to identify market trends, enhance product recommendations, and improve customer satisfaction. This study focuses on sentiment analysis of Amazon reviews using two major datasets—Fine Food Reviews and Unlocked Mobile Reviews—which exhibit label imbalance. To address this challenge, both oversampling and undersampling techniques were applied to balance the datasets. Various machine learning (ML) algorithms, including Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), and Gradient Boosting Machine (GBM), as well as deep learning (DL) models such as Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and transformer-based models like RoBERTa, were implemented. After data cleaning and preprocessing, models were trained, and performance was evaluated. The results indicate that oversampling significantly enhances classification accuracy, particularly for the Fine Food dataset. Among ML models, Random Forest achieved the highest accuracy due to its ensemble approach and robustness in handling high-dimensional data. DL models, particularly RoBERTa, also demonstrated superior performance owing to their capacity to capture contextual dependencies. The findings emphasize the importance of data balancing for optimal sentiment analysis and contribute valuable insights toward advancing automated opinion classification in e-commerce applications.
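The balancing step described above can be sketched in a few lines. This is a minimal illustration only: the abstract does not say which oversampling or undersampling variant was used, so plain random resampling is assumed here, and the `rebalance` helper and the toy reviews are hypothetical rather than taken from the study.

```python
import random
from collections import Counter, defaultdict


def rebalance(texts, labels, mode="over", seed=0):
    """Randomly resample so that every class ends up with the same count.

    mode="over"  duplicates minority-class examples up to the largest class.
    mode="under" drops majority-class examples down to the smallest class.
    Assumption: simple random resampling, not necessarily the paper's method.
    """
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")

    rng = random.Random(seed)

    # Group examples by their label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    sizes = [len(items) for items in by_label.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        if mode == "over":
            # Keep every original, then draw extra copies with replacement.
            sampled = items + rng.choices(items, k=target - len(items))
        else:
            # Keep a random subset without replacement.
            sampled = rng.sample(items, target)
        out_texts.extend(sampled)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Toy reviews, made up for illustration (not from either dataset).
texts = ["great", "love it", "tasty", "works well", "awful", "broke"]
labels = ["pos", "pos", "pos", "pos", "neg", "neg"]

over_texts, over_labels = rebalance(texts, labels, mode="over")
under_texts, under_labels = rebalance(texts, labels, mode="under")

print(Counter(over_labels))
print(Counter(under_labels))
```

In practice, resampling of this kind is usually applied to the training split only, so that duplicated reviews do not leak into the evaluation data.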
Eman Daraghmi
Noora Zyadeh
Daraghmi et al. studied this question.
www.synapsesocial.com/papers/69ada8b2bc08abd80d5bbd96 — DOI: https://doi.org/10.3390/fi18030138