Predictive modeling in financial telemarketing is frequently hindered by severe class imbalance and by the misalignment between standard academic evaluation metrics and real-world business utility. This study presents a cost-sensitive evaluation of predictive models on the complete UCI Bank Marketing dataset. We compared a baseline Logistic Regression model against an eXtreme Gradient Boosting (XGBoost) ensemble, using the Synthetic Minority Over-sampling Technique (SMOTE) to address a class imbalance in which the majority class makes up 88.7% of the data. Although the XGBoost model achieved 90% accuracy at the default decision threshold, financial utility analysis revealed an “Accuracy Paradox”: the model was overly conservative, generating a simulated campaign ROI of 184,440, lower than that of the simpler baseline (252,600). To resolve this, we introduced a generalized Cost-Benefit Ratio (CBR) framework. For a banking scenario with a reward of 500 per true positive and a cost of 10 per false positive, the framework shifts the optimal decision threshold to 0.02. This threshold tuning raised the XGBoost model’s simulated ROI to 391,200, the highest of the configurations evaluated. Furthermore, feature importance analysis identified age and macroeconomic indicators as the primary drivers of subscription behavior. The results underscore the necessity of shifting from abstract statistical accuracy to actionable financial utility functions. This work provides a reproducible, cost-sensitive framework for financial institutions to dynamically manage technical debt, proactively limit marketing expenditures, and optimize customer acquisition in imbalanced datasets.
Alhadi et al. (Sun,) studied this question.
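
The 0.02 threshold quoted in the abstract is consistent with a standard break-even argument: contacting a customer with subscription probability p pays off when p × reward exceeds (1 − p) × cost, which gives a threshold of cost / (cost + reward) = 10 / 510 ≈ 0.02. The following minimal sketch illustrates that calculation and a simple campaign utility function. It assumes that false negatives and true negatives carry zero payoff; the function names and the toy data are hypothetical and are not taken from the study, whose exact CBR formulation may differ.

```python
import numpy as np


def cost_benefit_threshold(tp_reward: float, fp_cost: float) -> float:
    """Break-even probability: contact when p * tp_reward > (1 - p) * fp_cost.

    Assumes zero payoff for customers who are not contacted.
    """
    return fp_cost / (fp_cost + tp_reward)


def campaign_utility(y_true, y_prob, threshold, tp_reward, fp_cost) -> float:
    """Simulated net return of contacting everyone scored at or above threshold."""
    y_true = np.asarray(y_true)
    contact = np.asarray(y_prob) >= threshold
    true_pos = int(np.sum(contact & (y_true == 1)))   # contacted subscribers
    false_pos = int(np.sum(contact & (y_true == 0)))  # contacted non-subscribers
    return float(true_pos * tp_reward - false_pos * fp_cost)


# Scenario from the abstract: reward of 500 per true positive, cost of 10 per false positive.
threshold = cost_benefit_threshold(tp_reward=500, fp_cost=10)

# Hypothetical labels and scores, for illustration only.
y_true = [1, 0, 0, 1, 1]
y_prob = [0.90, 0.01, 0.30, 0.015, 0.05]

default_utility = campaign_utility(y_true, y_prob, 0.5, 500, 10)
tuned_utility = campaign_utility(y_true, y_prob, threshold, 500, 10)
```

On this toy data the tuned threshold contacts one extra subscriber (score 0.05) at the price of one extra false positive (score 0.30), which is the trade-off that makes a low threshold attractive when the reward far exceeds the contact cost.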