What question did this study set out to answer?

The aim is to improve the robustness and explainability of malware classifiers using adversarial training with synthetic examples.

May 17, 2026 · Open Access

Improving robustness and explainability of PE malware classifiers using GAN-Generated Synthetic Adversarial examples

Key Points

The aim is to improve the robustness and explainability of malware classifiers using adversarial training with synthetic examples.
Adversarial Training with Synthetic Augmentation (ATS) was developed using Conditional Tabular GAN-generated examples.
Three classifiers evaluated: Random Forest, Light Gradient Boosting Machine (LightGBM), and Multilayer Perceptron (MLP).
Performance assessed against five PE adversarial attacks: Full DOS, EXTEND, SHIFT, FGSM-padding, and GAMMA.
ATS outperformed standard adversarial training in enhancing robustness across all classifiers and attacks.
Clean accuracy and F1-score were maintained or improved compared to standard adversarial training.
SHAP-based interpretability analysis showed reduced reliance on attack-sensitive low-level features and increased attention to stable PE features.
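To make the robustness comparison above concrete, the following is a minimal sketch of one common way to score a classifier per attack: the fraction of adversarial malware samples it still detects. The summary does not state the study's exact metric, so `evasion_rate`, `robustness_report`, the toy threshold classifier, and the sample feature values are all illustrative assumptions.

```python
from typing import Callable, Dict, List, Sequence

MALWARE, GOODWARE = 1, 0  # assumed label encoding, not taken from the study


def evasion_rate(predict: Callable[[Sequence[float]], int],
                 adversarial_malware: List[Sequence[float]]) -> float:
    """Fraction of adversarial malware samples misclassified as goodware."""
    if not adversarial_malware:
        raise ValueError("need at least one adversarial sample")
    evaded = sum(1 for x in adversarial_malware if predict(x) == GOODWARE)
    return evaded / len(adversarial_malware)


def robustness_report(predict: Callable[[Sequence[float]], int],
                      samples_by_attack: Dict[str, List[Sequence[float]]]
                      ) -> Dict[str, float]:
    """Detection rate (1 - evasion rate) for each named attack."""
    return {attack: 1.0 - evasion_rate(predict, samples)
            for attack, samples in samples_by_attack.items()}


# Toy classifier (hypothetical): flags a sample as malware when the
# first feature exceeds 0.5.
def toy_predict(x: Sequence[float]) -> int:
    return MALWARE if x[0] > 0.5 else GOODWARE


# Made-up one-feature samples, grouped by two of the attack names above.
samples = {
    "EXTEND": [[0.9], [0.2]],
    "SHIFT": [[0.7], [0.8]],
}
report = robustness_report(toy_predict, samples)
print(report)
```

Running the same report for a model trained with standard adversarial training and one trained with ATS would give a per-attack comparison of the kind summarized above.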

Abstract

Adversarial machine learning has exposed critical vulnerabilities in Artificial Intelligence-based Windows Portable Executable (PE) malware detection. A small, well-crafted perturbation to a PE malware binary can cause it to be misclassified as goodware. Adversarial Training (AT) is one of the most effective defenses; however, it is not always sufficient on its own and often suffers from the robustness–accuracy trade-off. This study proposes Adversarial Training with Synthetic Augmentation (ATS), a novel defense methodology that augments standard adversarial training with unrealistic synthetic adversarial examples produced using the Conditional Tabular GAN (CTGAN). The robustness and resilience of Random Forest, Light Gradient Boosting Machine (LightGBM), and Multilayer Perceptron (MLP) classifiers were evaluated against five realistic Windows PE adversarial attacks: Full DOS, EXTEND, SHIFT, FGSM-padding, and GAMMA. Results show that the ATS methodology consistently outperformed AT in enhancing robustness across all attacks and classifiers while maintaining or improving clean accuracy and F1-score. SHAP-based interpretability analysis further reveals that ATS reduces dependence on attack-sensitive low-level features and increases attention to stable PE features. Overall, ATS provides a model-agnostic enhancement to standard AT, effectively reducing false negatives without compromising clean accuracy.
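The augmentation step described in the abstract can be sketched as follows, under stated assumptions. The abstract does not give the paper's pipeline, so this is not the authors' implementation: `build_ats_training_set` is a hypothetical helper, the generator is passed in as a plain callable, and `jitter_generator` is a simple stand-in for sampling from a trained CTGAN. Labeling every adversarial and synthetic row as malware is also an assumption.

```python
import random
from typing import Callable, List, Sequence, Tuple

MALWARE, GOODWARE = 1, 0  # assumed label encoding
Row = List[float]
Generator = Callable[[Sequence[Row], int, random.Random], List[Row]]


def build_ats_training_set(clean_X: Sequence[Row],
                           clean_y: Sequence[int],
                           adversarial_X: Sequence[Row],
                           synthesize: Generator,
                           n_synthetic: int,
                           seed: int = 0) -> Tuple[List[Row], List[int]]:
    """Combine clean data, real adversarial examples (standard AT), and
    generator-produced synthetic adversarial examples (the ATS addition)."""
    rng = random.Random(seed)
    synthetic_X = synthesize(adversarial_X, n_synthetic, rng)
    X = list(clean_X) + list(adversarial_X) + synthetic_X
    # Assumption: adversarial and synthetic rows all derive from malware.
    y = (list(clean_y)
         + [MALWARE] * len(adversarial_X)
         + [MALWARE] * len(synthetic_X))
    order = list(range(len(X)))
    rng.shuffle(order)  # mix sources before training
    return [X[i] for i in order], [y[i] for i in order]


def jitter_generator(samples: Sequence[Row], n: int,
                     rng: random.Random) -> List[Row]:
    """Stand-in for a CTGAN sampler: adds Gaussian noise to real
    adversarial rows. A trained generative model would replace this."""
    return [[v + rng.gauss(0.0, 0.05) for v in rng.choice(samples)]
            for _ in range(n)]


# Made-up two-feature rows for illustration only.
clean_X = [[0.1, 0.9], [0.8, 0.2]]
clean_y = [GOODWARE, MALWARE]
adversarial_X = [[0.3, 0.7], [0.25, 0.75]]

X_train, y_train = build_ats_training_set(
    clean_X, clean_y, adversarial_X, jitter_generator, n_synthetic=4)
print(len(X_train), y_train.count(MALWARE))
```

Because the generator is only a callable, the same assembly step works with any of the three classifier families named above, which reflects the model-agnostic framing in the abstract.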
