May 18, 2026 · Open Access

Addressing class imbalance in bankruptcy prediction: a cost-aware and explainable machine learning approach

Abstract

Corporate bankruptcy prediction is important for financial risk management, credit assessment, and early warning systems, yet it remains challenging under extreme class imbalance and asymmetric misclassification costs. This study develops a leakage-free and decision-oriented machine learning framework for bankruptcy prediction that integrates feature de-redundancy, cost-sensitive learning, probability calibration, threshold optimization, and explainable prediction within a unified evaluation pipeline. The empirical analysis is conducted on the Taiwanese Bankruptcy Prediction benchmark using repeated stratified cross-validation, two correlation thresholds, and multiple cost configurations. Rather than relying on discrimination metrics alone, the study evaluates model performance through precision–recall behavior, recall-oriented measures, calibration quality, expected misclassification cost, threshold stability, and statistical comparison. The results show that the Random Forest configuration consistently outperforms the Logistic Regression baseline in decision-oriented evaluation. Across alternative cost ratios, model rankings remain stable, and the preferred configuration yields lower expected cost, stronger minority-class detection, and more stable threshold behavior. Calibration analysis further indicates that predicted probabilities remain compressed under severe imbalance, highlighting the importance of combining probability calibration with cost-sensitive threshold selection. Overall, the study contributes a more rigorous and practically relevant benchmark framework for bankruptcy prediction in settings where model outputs support financially consequential decisions.
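To make the idea of cost-sensitive threshold selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes a false negative (a missed bankruptcy) costs ten times as much as a false positive; that ratio, the function names `expected_cost` and `best_threshold`, and the toy scores are all hypothetical and do not come from the study. The sketch searches a grid of thresholds for the one that minimizes expected misclassification cost.

```python
# Sketch only: the 10:1 cost ratio and the toy scores below are
# illustrative assumptions, not values reported in the study.

def expected_cost(y_true, y_prob, threshold, cost_fn=10.0, cost_fp=1.0):
    """Mean misclassification cost per firm when flagging p >= threshold."""
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return (cost_fn * fn + cost_fp * fp) / len(y_true)


def best_threshold(y_true, y_prob, cost_fn=10.0, cost_fp=1.0):
    """Grid-search thresholds 0.01..0.99; ties go to the lowest threshold."""
    grid = [i / 100 for i in range(1, 100)]
    costs = [expected_cost(y_true, y_prob, t, cost_fn, cost_fp) for t in grid]
    best = min(range(len(grid)), key=costs.__getitem__)
    return grid[best], costs[best]


# Hypothetical validation fold: 8 healthy firms (0), 2 bankrupt firms (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.30, 0.25, 0.60]

threshold, cost = best_threshold(y_true, y_prob)
default_cost = expected_cost(y_true, y_prob, 0.5)
print(f"cost-minimizing threshold={threshold:.2f}, cost={cost:.2f}")
print(f"cost at default 0.50 threshold={default_cost:.2f}")
```

On this toy fold the cost-minimizing threshold falls well below 0.5, because the low-scoring bankrupt firm is expensive to miss. This mirrors the abstract's point that compressed probabilities under severe imbalance make the default cutoff a poor decision rule. In a leakage-free setup, the threshold would presumably be selected on training or validation folds only and then applied unchanged to held-out data.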
