What question did this study set out to answer?

Whether Borderline Shifting, a novel resampling method, can address class imbalance in supervised learning.

March 7, 2026 | Open Access

An approach for handling imbalanced datasets using borderline shifting

Key Points

Introduced a resampling technique called Borderline Shifting to enhance borderline instances.
Compared the new method against 7 established resampling techniques on 30 benchmark datasets.
Evaluated with three classifiers: Random Forest, Naïve Bayes, and Support Vector Machine.
Measured performance using F1-score, G-mean, AUC, recall, and precision.
Borderline Shifting consistently outperformed traditional methods like SMOTE and Borderline-SMOTE.
Achieved an average F1-score of 0.83 ± 0.06 and G-mean of 0.86 ± 0.05 with SVM.
Improved Naïve Bayes F1-score from 0.62 (baseline) to 0.78 ± 0.07.
Random Forest exhibited the highest G-mean of 0.88 ± 0.04 and stable AUC of 0.91 ± 0.03.
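
The metrics named in the key points can be computed directly from confusion-matrix counts and classifier scores. The following is a minimal sketch in plain Python, not the study's evaluation code. The function names (imbalance_metrics, auc_from_scores) and the toy labels are illustrative assumptions, and the minority class is assumed to be labeled 1.

```python
from math import sqrt


def imbalance_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, F1 and G-mean for a binary problem.

    `positive` marks the minority class (an assumption of this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    g_mean = sqrt(recall * specificity)                # balances both classes
    return {"precision": precision, "recall": recall,
            "f1": f1, "g_mean": g_mean}


def auc_from_scores(y_true, scores, positive=1):
    """AUC as the chance a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 2 minority samples among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.35, 0.05, 0.15, 0.25]

metrics = imbalance_metrics(y_true, y_pred)
auc = auc_from_scores(y_true, scores)
```

On this toy data, plain accuracy would be 0.8 even though half of the minority class is missed, while F1 is 0.5. That gap is the usual reason imbalanced-learning studies report F1-score, G-mean, and AUC rather than accuracy.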

Abstract

In supervised learning tasks, class imbalance is a persistent problem that often leads to biased classification models that favor the majority class over the minority class. To tackle this problem, we present a new resampling method called Borderline Shifting, which strengthens the model’s capacity to distinguish between classes close to the decision boundary by selectively enhancing significant borderline instances. Using 30 benchmark imbalanced datasets, this study compares the proposed method to 7 popular resampling techniques: Random Under Sampling (RUS), Random Over Sampling (ROS), SMOTE, Borderline-SMOTE, NearMiss, SMOTE-Tomek, and SMOTEENN. Performance was assessed using three well-known classifiers: Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM). Evaluation metrics included F1-score, G-mean, AUC, recall, and precision. The findings show that the Borderline Shifting method consistently produced better results across every metric and classifier. With SVM, our approach outperformed conventional methods such as SMOTE and Borderline-SMOTE, achieving an average F1-score of 0.83 ± 0.06, G-mean of 0.86 ± 0.05, and AUC of 0.89 ± 0.04. For Naïve Bayes, which is usually sensitive to data imbalance, our method significantly improved the F1-score from 0.62 (baseline) to 0.78 ± 0.07 and the AUC from 0.68 to 0.84 ± 0.06. The robust Random Forest also benefited greatly: our approach produced the highest overall G-mean of 0.88 ± 0.04 and a stable AUC of 0.91 ± 0.03 with little variation between datasets. These findings show that the proposed Borderline Shifting approach not only handles the imbalance problem more successfully than current approaches but also improves classification performance consistently across various learning models. This makes it a viable and broadly applicable solution for real-world imbalanced learning scenarios.
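
For readers unfamiliar with the baselines, the two simplest techniques in the comparison, Random Over Sampling (ROS) and Random Under Sampling (RUS), can be sketched in a few lines of plain Python. This is an illustrative sketch of those two baselines only, not the Borderline Shifting procedure, whose steps this abstract does not specify. The function names, the seed parameter, and the toy data are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS: duplicate random samples of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, t in zip(X, y) if t == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))  # sampled with replacement
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, t in zip(X, y) if t == label]
        for x in rng.sample(pool, target):  # sampled without replacement
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


# Toy data (hypothetical): 2 minority samples (label 1) among 10.
X = [[float(i)] for i in range(10)]
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

X_ros, y_ros = random_oversample(X, y)
X_rus, y_rus = random_undersample(X, y)
```

ROS balances the classes by repeating minority samples, and RUS does so by discarding majority samples. Neither takes into account where a sample lies relative to the decision boundary, which is the aspect that boundary-aware methods such as Borderline-SMOTE and the proposed Borderline Shifting focus on.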
