Deep neural networks (DNNs) have achieved remarkable accuracy in image classification, driving significant progress in computer vision applications. However, adversarial examples have emerged as a critical challenge to the robustness and efficiency of deep learning-based image classifiers. Adversarial examples are input images carrying specially crafted perturbations that deceive a model into making incorrect predictions while remaining indistinguishable from the original images to human observers. In this paper, we present a defense mechanism, Defensive Distillation with Gaussian Blurring (DDGB), that improves the robustness of deep learning models against adversarial attacks. First, two models, a teacher and a student, are used to train and validate the approach: the teacher model is trained and then used to produce softened probabilities, which in turn are used to train the student model. Second, a feature-squeezing technique based on Gaussian blurring is applied to the adversarial examples generated from the distilled student model, making the adversarial perturbations less effective. The results demonstrate that the proposed approach is effective, achieving classification accuracies of 87.61% and 87.48% under Fast Gradient Sign Method (FGSM) and Basic Iterative Method (BIM) attacks, respectively, on the CIFAR-10 dataset. In addition, the approach achieves a 70.66% reduction in computations for the student model, allowing it to be deployed on resource-constrained devices while providing improved prediction accuracy under adversarial attacks.
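The two building blocks named above, softened probabilities and Gaussian blurring, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names, the temperature of 20, and the blur radius and sigma are all assumptions chosen for illustration, since the abstract does not state these settings.

```python
import math

def softened_probabilities(logits, temperature=20.0):
    """Temperature-scaled softmax; the temperature value is an assumption."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_kernel(radius=1, sigma=1.0):
    """Normalized 1-D Gaussian weights; radius and sigma are assumptions."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    """Convolve one row or column, replicating edge pixels at the borders."""
    radius = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * values[j]
        out.append(acc)
    return out

def gaussian_blur(image, radius=1, sigma=1.0):
    """Separable Gaussian blur of a 2-D grayscale image (list of rows)."""
    kernel = gaussian_kernel(radius, sigma)
    rows = [blur_1d(row, kernel) for row in image]
    # Transpose, blur what were the columns, then transpose back.
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

# Hypothetical teacher logits for a 3-class problem.
teacher_logits = [8.0, 2.0, -1.0]
soft_targets = softened_probabilities(teacher_logits, temperature=20.0)
hard_targets = softened_probabilities(teacher_logits, temperature=1.0)

# A single bright pixel stands in for a localized perturbation.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0
squeezed = gaussian_blur(spike, radius=1, sigma=1.0)
```

A higher temperature flattens the teacher's output distribution, so the student is trained on relative class similarities rather than near one-hot labels. The blur spreads a sharp pixel-level change over its neighbors, which is the intuition behind using it to weaken small adversarial perturbations.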
Ahmed et al. (Mon,) studied this question.