
September 10, 2025 | Open Access

A Comparative Study on Supervised Machine Learning Algorithms for Credit Card Transaction Fraud Detection

Key Points

Random Forest and XGBoost showed superior performance for credit card fraud detection, even without SMOTE.
Logistic Regression and SVM achieved high accuracy yet struggled with key classification metrics for fraud detection.
The study assessed the effects of the Synthetic Minority Over-sampling Technique (SMOTE) on model performance.
Comparison of traditional supervised learning models and ensemble methods shows varied effectiveness in fraud detection.

Abstract

The global cost of credit card fraud continues to rise, driven by increasingly concentrated and sophisticated attacks. This situation underscores the necessity for more effective detection and prevention methods. In response to this need, machine learning has advanced significantly in recent years. This paper provides an overview and comparison of various models. On one hand, there are traditional supervised learning models, such as Logistic Regression, Decision Trees, and Support Vector Machines (SVM). On the other hand, ensemble methods like Random Forest, Gradient Boosting, and XGBoost are also covered. Given the highly imbalanced nature of credit card fraud datasets, the study also examines the impact of the Synthetic Minority Over-sampling Technique (SMOTE) on classification performance. While SMOTE has been shown to improve the performance of weaker classifiers, its benefits for advanced ensemble methods remain less clear. Consequently, this paper identifies which models benefit most from oversampling and assesses whether high-performing classifiers can mitigate the effects of imbalance without the need for data augmentation. When the models' performance was compared, Random Forest and XGBoost demonstrated superior performance both with and without SMOTE. Without SMOTE, two models, Logistic Regression and SVM, yielded high accuracy but near-zero performance on key classification metrics, highlighting their inability to effectively detect minority class instances.
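The two ideas the abstract relies on can be illustrated with a minimal sketch: why high accuracy can coexist with near-zero recall on imbalanced data, and how SMOTE-style oversampling creates synthetic minority points by interpolation. This is not the paper's code. The class counts, the sample points, and the helper names `accuracy_and_recall` and `smote_like_oversample` are assumptions made for illustration, and the oversampler is a simplified stand-in for SMOTE rather than a library implementation.

```python
import random


def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) where the positive class is labelled 1."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return correct / len(y_true), recall


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style sketch: each synthetic point lies on the
    segment between a minority sample and one of its k nearest
    minority neighbours (squared Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical imbalanced labels: 990 legitimate (0), 10 fraudulent (1).
y_true = [0] * 990 + [1] * 10
# A degenerate classifier that always predicts "legitimate".
y_pred = [0] * 1000
acc, rec = accuracy_and_recall(y_true, y_pred)
print(acc, rec)  # 0.99 0.0

# Hypothetical 2-D feature vectors for a few fraud cases.
fraud_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like_oversample(fraud_points, n_new=6)
print(len(new_points))  # 6
```

The always-legitimate predictor scores 99% accuracy while detecting no fraud at all, which is the pattern the abstract reports for Logistic Regression and SVM without SMOTE. In practice, oversampling of this kind is applied to the training split only, so that synthetic points do not leak into evaluation.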
