The rapid evolution of Domain Generation Algorithm (DGA)-driven attacks and obfuscated DNS traffic exposes fundamental weaknesses in conventional machine learning-based threat detection systems, particularly under adversarial manipulation. This study introduces FGM-GAN, a hybrid adversarial learning framework that combines gradient-based Fast Gradient Method (FGM) perturbations with adaptive Generative Adversarial Network (GAN)-based perturbations to improve both the robustness and the interpretability of deep neural networks for DNS threat classification. Unlike existing adversarial defenses that rely on model-specific perturbations, FGM-GAN explicitly learns class-conditional adversarial distributions for benign, phishing, and malware domains. This design enables the generation of realistic, feature-aligned perturbations that exhibit strong cross-model transferability. Experiments were conducted on the 32-feature CIC-BELL-DNS-2021 dataset (approximately 7000 labeled samples) using 5-fold cross-validation and hybrid perturbations, and the framework was evaluated against baseline DNN, SVM, Random Forest, KNN, and Decision Tree classifiers using accuracy and robustness metrics. The evaluation shows that FGM-GAN consistently improves robustness across diverse adversarial attacks (FGM, PGD, MIM, C&W) while maintaining stable performance across folds. Ablation studies and reduced-capacity variants confirm that the gains arise from the hybrid adversarial mechanism rather than from over-parameterization or hyperparameter tuning, and statistical significance tests verify the reproducibility of the results. To enhance transparency and operational trust, the framework integrates multi-level explainable AI analyses spanning feature, neuron, and layer representations.
These analyses consistently identify a compact set of high-impact DNS features and reveal structured adversarial propagation patterns, showing that robustness emerges from semantically meaningful representation learning. Collectively, these findings position FGM-GAN as a scalable and interpretable adversarial learning solution that jointly addresses robustness, transferability, and explainability in real-world DNS-based cybersecurity environments.

• FGM-GAN hybrid improves neural network robustness against adversarial attacks
• GANs produce realistic, class-specific adversarial perturbations for DNS data
• Adversarial transferability validated across KNN, SVM, Decision Trees, RF
• Gradient-XAI interprets feature, neuron, and layer-level model vulnerabilities
• Combines robustness and explainability for actionable cyber threat intelligence
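
To make the gradient-based half of the hybrid concrete, the following is a minimal sketch of a single L2-normalized FGM step. It is an illustration under stated assumptions, not the authors' implementation: the linear softmax classifier stands in for the DNN, the data are random, and the function name `fgm_perturb` and the step size `eps=0.1` are hypothetical choices rather than values taken from the study. Only the feature count (32) and the three classes (benign, phishing, malware) follow the text above.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fgm_perturb(x, y, W, b, eps=0.1):
    """One L2-normalized FGM step against a linear softmax classifier.

    The classifier (W, b) is a stand-in model and eps is an
    illustrative value; neither comes from the study.
    """
    probs = softmax(x @ W + b)
    onehot = np.eye(W.shape[1])[y]
    # Gradient of the cross-entropy loss with respect to the input x.
    grad = (probs - onehot) @ W.T
    norm = np.linalg.norm(grad, axis=1, keepdims=True)
    # Move each sample a distance eps along its loss gradient.
    return x + eps * grad / np.maximum(norm, 1e-12)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 32))   # 4 synthetic samples, 32 features
y = np.array([0, 1, 2, 0])     # benign / phishing / malware labels
W = rng.normal(size=(32, 3))   # stand-in classifier weights
b = np.zeros(3)
x_adv = fgm_perturb(x, y, W, b, eps=0.1)
```

Each perturbed sample lies exactly `eps` away from its original in L2 distance. Replacing the normalized gradient with its sign would give the sign-based variant usually called FGSM. The GAN-based component described above would contribute learned, class-conditional perturbations in addition to this step and is not sketched here.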
Henna et al. (Sun,) studied this question.