March 3, 2026 | Open Access

On the Impact of Imbalance Handling Techniques in Deep Learning Models for Malware Detection

Key Points

Effective malware classification relies on balancing CNN learning with resampling techniques, which improves detection accuracy.
ADASYN sharpens decision boundaries in CNNs, whereas combining random oversampling and undersampling (ROS+RUS) risks overfitting, which can markedly degrade model performance.
Imbalanced data presents significant challenges in deep learning models, particularly for malware classification.
Experimental results highlight the efficacy of different imbalance handling techniques within a CNN framework.

Abstract

Malware remains one of the top cybersecurity threats globally and ranks among the most critical threats in North America and Europe. Its rapid spread and increasing sophistication make accurate detection a top priority for organizations seeking to protect their infrastructure and sensitive data. Convolutional Neural Networks (CNNs), known for their strength in visual pattern recognition, have proven effective at detecting malware: malware files are converted into images, and the network's image-processing capabilities are applied to them. However, one major challenge in applying CNNs to malware detection is imbalanced data, where certain malware classes are underrepresented. This study evaluates the impact of various imbalance handling techniques on CNN performance in the context of malware classification. Experimental results demonstrate that effective malware classification depends on balancing CNN learning with resampling. ADASYN sharpens decision boundaries, whereas ROS+RUS risks overfitting, which makes discriminative feature learning essential.
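
To make the ROS+RUS idea concrete, the following is a minimal sketch, not the study's implementation. It assumes a hybrid scheme in which every class is resampled to a common size: classes above the target are randomly undersampled (RUS) and classes below it are randomly oversampled by duplication (ROS). The function name ros_rus_resample, the target_per_class parameter, and the toy family labels are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def ros_rus_resample(samples, labels, target_per_class, seed=0):
    """Resample so that every class ends up with target_per_class items.

    Hypothetical helper for illustration; the target size is an assumption,
    not a value taken from the study.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target_per_class:
            # RUS: keep a random subset, drawn without replacement.
            chosen = rng.sample(items, target_per_class)
        else:
            # ROS: keep all originals and add random duplicates.
            extra = rng.choices(items, k=target_per_class - len(items))
            chosen = items + extra
        out_samples.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy data: integers stand in for malware images.
    toy_samples = list(range(12))
    toy_labels = ["family_a"] * 9 + ["family_b"] * 3
    xs, ys = ros_rus_resample(toy_samples, toy_labels, target_per_class=5)
    print(sorted(zip(ys, xs)))
```

Because the oversampled classes contain exact duplicates, this kind of resampling adds no new information about minority classes, which is consistent with the overfitting risk noted above. ADASYN differs in that it synthesizes new minority samples by interpolating between neighbors, concentrating on regions that are harder to learn.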
