What does this research mean for the field?

Oversampling significantly enhances classification accuracy in sentiment analysis of Amazon product reviews, particularly for the Fine Food dataset. Novelty: confirmatory. Consensus alignment: supports consensus.

What question did this study set out to answer?

To compare various machine learning and deep learning techniques for sentiment classification of Amazon product reviews.

March 8, 2026 (Open Access)

Sentiment Classification of Amazon Product Reviews Based on Machine and Deep Learning Techniques: A Comparative Study

Key Points

Compared machine learning and deep learning techniques for sentiment classification of Amazon product reviews.
Analyzed two datasets: Fine Food Reviews and Unlocked Mobile Reviews.
Used oversampling and undersampling techniques for data balancing.
Implemented ML algorithms like Random Forest, Logistic Regression, SVM, Naïve Bayes, and Gradient Boosting.
Applied DL models, including CNN, LSTM, and transformer-based RoBERTa.
Conducted data cleaning, preprocessing, training, and performance evaluation.
Oversampling improved classification accuracy, especially for the Fine Food dataset.
Random Forest achieved the highest accuracy among the machine learning models, which the authors attribute to its ensemble approach.
RoBERTa outperformed other deep learning models by effectively capturing contextual dependencies.
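
The balancing step listed above can be illustrated with a minimal sketch. The summary does not say which oversampling or undersampling variant the authors used, so this example assumes the simplest form: random duplication of minority-class rows and random removal of majority-class rows. The function names `random_oversample` and `random_undersample` and the toy reviews are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def _group_by_label(texts, labels):
    """Collect the texts belonging to each label."""
    groups = {}
    for text, label in zip(texts, labels):
        groups.setdefault(label, []).append(text)
    return groups


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class.

    Assumption: plain random oversampling, not necessarily the paper's variant.
    """
    rng = random.Random(seed)
    groups = _group_by_label(texts, labels)
    target = max(len(rows) for rows in groups.values())
    pairs = []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        pairs.extend((text, label) for text in rows + extra)
    rng.shuffle(pairs)
    return [t for t, _ in pairs], [l for _, l in pairs]


def random_undersample(texts, labels, seed=0):
    """Drop randomly chosen rows until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(texts, labels)
    target = min(len(rows) for rows in groups.values())
    pairs = []
    for label, rows in groups.items():
        pairs.extend((text, label) for text in rng.sample(rows, target))
    rng.shuffle(pairs)
    return [t for t, _ in pairs], [l for _, l in pairs]


# Toy, made-up reviews: four positive, two negative.
demo_texts = ["great taste", "love it", "works well", "fast delivery",
              "arrived broken", "stale and bland"]
demo_labels = ["pos", "pos", "pos", "pos", "neg", "neg"]

over_texts, over_labels = random_oversample(demo_texts, demo_labels)
under_texts, under_labels = random_undersample(demo_texts, demo_labels)

print(Counter(over_labels))   # class counts after oversampling
print(Counter(under_labels))  # class counts after undersampling
```

Oversampling keeps every original review and adds duplicates, while undersampling discards data. That difference is one plausible reason the two strategies can lead to different accuracy, although the summary itself does not state the cause.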

Abstract

Sentiment classification plays a crucial role in analyzing customer feedback to identify market trends, enhance product recommendations, and improve customer satisfaction. This study focuses on sentiment analysis of Amazon reviews using two major datasets—Fine Food Reviews and Unlocked Mobile Reviews—which exhibit label imbalance. To address this challenge, both oversampling and undersampling techniques were applied to balance the datasets. Various machine learning (ML) algorithms, including Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), and Gradient Boosting Machine (GBM), as well as deep learning (DL) models such as Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and transformer-based models like RoBERTa, were implemented. After data cleaning and preprocessing, models were trained, and performance was evaluated. The results indicate that oversampling significantly enhances classification accuracy, particularly for the Fine Food dataset. Among ML models, Random Forest achieved the highest accuracy due to its ensemble approach and robustness in handling high-dimensional data. DL models, particularly RoBERTa, also demonstrated superior performance owing to their capacity to capture contextual dependencies. The findings emphasize the importance of data balancing for optimal sentiment analysis and contribute valuable insights toward advancing automated opinion classification in e-commerce applications.
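
To make concrete why label imbalance is a challenge for accuracy-based evaluation, the following sketch scores a trivial classifier that always predicts the majority class. The 90/10 class split is invented for illustration and is not the distribution of either dataset; `accuracy` and `recall_per_class` are hypothetical helper names.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred):
    """For each class, the fraction of its true examples that were predicted correctly."""
    recalls = {}
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        recalls[cls] = sum(y_pred[i] == cls for i in indices) / len(indices)
    return recalls


# Invented imbalanced labels: nine positive reviews, one negative.
y_true = ["pos"] * 9 + ["neg"] * 1
# A trivial baseline that ignores the text and always answers "pos".
y_pred = ["pos"] * 10

print(accuracy(y_true, y_pred))          # 0.9
print(recall_per_class(y_true, y_pred))  # {'neg': 0.0, 'pos': 1.0}
```

The baseline reaches 90% accuracy without identifying a single negative review, which is why an accuracy figure on an unbalanced dataset is best read alongside per-class results.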

Cite This Study

Daraghmi et al. studied this question.

synapsesocial.com/papers/69ada8b2bc08abd80d5bbd96 https://doi.org/10.3390/fi18030138
