July 10, 2025

Comparative Analysis Research on Machine Learning Models in Credit Risk Assessment

Key Points

The ensemble models XGBoost and LightGBM achieved an average accuracy of 94% and an AUC of 0.98 across both datasets, outperforming the traditional models.
SHapley Additive exPlanations (SHAP) values were used for interpretability analysis, addressing the lack of model interpretability in credit risk assessment.
The Synthetic Minority Over-Sampling Technique (SMOTE) balanced the sample distribution, and engineered features such as the income-debt ratio reduced variable redundancy.
The analysis offers a foundation for high-precision, interpretable credit risk control, with generalization checked across the two datasets, suggesting potential for broader application across financial sectors.
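The two headline metrics measure different things, which matters when defaulters are rare. The sketch below is illustrative only: the labels and scores are invented toy data rather than results from the study, and the helper names `accuracy` and `roc_auc` are hypothetical. It computes AUC as the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as P(score of a random positive > score of a random negative).

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented): 2 defaulters (label 1) among 10 borrowers.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.60, 0.40, 0.90]

# A degenerate model that never predicts default still looks accurate.
always_negative = [0] * len(y_true)
print(accuracy(y_true, always_negative))  # 0.8
print(roc_auc(y_true, scores))            # 0.9375
```

On imbalanced data, a model that never flags a default can still score high accuracy, which is one reason AUC is commonly reported alongside it.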

Abstract

Credit risk assessment is crucial to risk management and control at financial institutions, but it faces challenges such as sample imbalance, complex features, and a lack of model interpretability. This study used two public datasets, "Give Me Some Credit" and "Loan Default". The Synthetic Minority Over-Sampling Technique (SMOTE) was employed to balance the sample distribution, and feature engineering constructed new features such as the income-debt ratio (IncomeDebtRatio) to reduce variable redundancy. The study also compared the performance of models including logistic regression and Random Forest (RF), and reports improved training efficiency. The experimental results show that the ensemble models (XGBoost, LightGBM) outperform the traditional models on both datasets, with an average accuracy of 94% and an AUC of 0.98. Furthermore, SHapley Additive exPlanations (SHAP) values were used for interpretability analysis. This study provides credit institutions with a high-precision, interpretable model construction scheme and verifies the model's generalization ability across datasets, laying a theoretical and practical foundation for future credit risk control and the construction of an integrated system.
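The balancing and feature-construction steps can be sketched roughly as follows. This is a minimal standard-library illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the study's actual pipeline. The toy rows, the function names, the order of the two steps, and the exact formula for the income-debt ratio (including the +1 in the denominator) are all assumptions, since the abstract does not specify them.

```python
import math
import random


def income_debt_ratio(monthly_income, monthly_debt):
    """Hypothetical derived feature: income divided by debt payments.

    The +1 in the denominator is an assumption to avoid division by zero.
    """
    return monthly_income / (monthly_debt + 1.0)


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(base, m))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy rows (invented): (monthly income, monthly debt payment, label); 1 = default.
rows = [
    (5200.0, 900.0, 0), (6100.0, 400.0, 0), (4800.0, 1200.0, 0),
    (7000.0, 300.0, 0), (3900.0, 700.0, 0), (5600.0, 500.0, 0),
    (2100.0, 1500.0, 1), (1800.0, 1700.0, 1), (2500.0, 1400.0, 1),
]
majority = [(inc, debt) for inc, debt, y in rows if y == 0]
minority = [(inc, debt) for inc, debt, y in rows if y == 1]

# Oversample the minority class until both classes are the same size.
synthetic = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + synthetic

# Add the derived ratio to every row after balancing.
features = [(inc, debt, income_debt_ratio(inc, debt))
            for inc, debt in majority + balanced_minority]
print(len(majority), len(balanced_minority))  # 6 6
```

In practice, oversampling is usually applied to the training split only, so that synthetic points do not leak into the evaluation data. A maintained implementation such as the imbalanced-learn library would normally replace this hand-written sketch.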

Cite This Study

Bohan Zhang studied this question.

synapsesocial.com/papers/68af55ccad7bf08b1eadc15b https://doi.org/10.62051/19wa7a05
