Credit card fraud presents a persistent threat to financial institutions, exacerbated by the rise of digital payments and the complexity of fraudulent schemes. This study investigates machine learning (ML) approaches for fraud detection in severely imbalanced datasets, focusing on three key objectives: comparing classification and anomaly detection models under extreme class imbalance, identifying transaction features with the highest discriminative power, and optimizing decision thresholds using cost-sensitive evaluation to minimize business impact. Utilizing a dataset of 999 transactions with a fraud rate of 0.2% (498.5:1 imbalance), we implemented supervised methods (logistic regression, random forest, gradient boosting) and unsupervised anomaly detection (Isolation Forest, One-Class SVM, Local Outlier Factor). Results show that ensemble-based models, particularly Gradient Boosting, achieved superior performance (AUC-ROC = 0.956; AUC-PR = 0.378) with perfect recall and improved precision relative to other methods. Feature analysis identified anonymized PCA-derived variables (V14, V10, V12) as the most discriminative indicators of fraudulent activity. Threshold optimization at 0.9 minimized operational costs (2,985) while maintaining full recall, yielding an estimated annual net benefit of 68,985 and a return on investment of 186.7%. This study contributes to the literature by integrating algorithm benchmarking, feature importance evaluation, and cost-sensitive threshold optimization in an end-to-end fraud detection framework. The findings underscore the importance of ensemble learning, imbalanced evaluation metrics (AUC-PR, precision, recall), and business-driven threshold calibration for developing effective and economically viable fraud prevention systems. Future research should explore larger datasets, adaptive learning to address concept drift, and explainable AI techniques to enhance interpretability and regulatory compliance.
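The cost-sensitive threshold selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-error costs (`cost_fp`, `cost_fn`), and the toy labels and scores are all hypothetical, chosen only to show the mechanics of sweeping candidate thresholds and keeping the cheapest one. The small `imbalance_ratio` helper shows that the reported 498.5:1 ratio is consistent with 2 fraud cases among 999 transactions, which is an inference from the reported figures rather than a count stated in the abstract.

```python
from typing import List, Sequence, Tuple


def imbalance_ratio(n_legit: int, n_fraud: int) -> float:
    """Majority-to-minority class ratio."""
    return n_legit / n_fraud


def total_cost(y_true: Sequence[int], scores: Sequence[float],
               threshold: float, cost_fp: float, cost_fn: float) -> float:
    """Cost of flagging every transaction whose score is >= threshold.

    cost_fp and cost_fn are hypothetical per-error costs, not values
    taken from the study.
    """
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return fp * cost_fp + fn * cost_fn


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   thresholds: Sequence[float],
                   cost_fp: float, cost_fn: float) -> Tuple[float, float]:
    """Return (threshold, cost) with the lowest total cost.

    Ties are broken toward the lower threshold, which favors recall.
    """
    candidates: List[Tuple[float, float]] = [
        (total_cost(y_true, scores, t, cost_fp, cost_fn), t)
        for t in thresholds
    ]
    cost, threshold = min(candidates)
    return threshold, cost


if __name__ == "__main__":
    # Toy data (hypothetical): seven legitimate transactions, one fraud.
    y_true = [0, 0, 0, 0, 0, 0, 0, 1]
    scores = [0.05, 0.10, 0.20, 0.40, 0.60, 0.85, 0.92, 0.95]
    thresholds = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

    print(imbalance_ratio(997, 2))
    print(best_threshold(y_true, scores, thresholds,
                         cost_fp=5.0, cost_fn=500.0))
```

Because a missed fraud is assumed to cost far more than a false alarm, the sweep only moves to a high threshold when doing so removes false positives without dropping any fraud case, which mirrors the abstract's observation that a threshold of 0.9 reduced cost while keeping full recall.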
Sezai Tunca
Alanya Hamdullah Emin Pasa University
Yavuz Selim Balcıoğlu
Doğuş University
Ceren Çubukçu Çerasi
Gebze Technical University
International Journal of Basic and Applied Sciences
Tunca et al. studied this question.
synapsesocial.com/papers/68d461d231b076d99fa615ca — DOI: https://doi.org/10.14419/m6x6fn74