Deep neural networks (DNNs) are widely applied across many domains. However, recent research has revealed that DNNs are vulnerable to adversarial examples, and existing adversarial attack methods can easily mislead these models. Traditional denoising techniques effectively deter certain attacks, but they have limitations. To address this, we propose Text-guided Diffusion Model Purification (TGDP), an adversarial perturbation purification method based on diffusion models. The method preprocesses input images to purify adversarial perturbations. TGDP employs Protogen x3.4 (Photorealism) Official Release as the diffusion model for conditional image generation. During generation, text information is incorporated to enhance control over the diffusion model, rather than relying entirely on the model's internal randomness. By iteratively adding Gaussian noise to disrupt the adversarial example and then reversing the noise-addition process to restore the image, we can completely eliminate carefully crafted perturbations, achieving the purification objective. Extensive experiments on the ImageNet dataset against common adversarial attacks demonstrate that TGDP outperforms other defense methods evaluated on the same dataset.
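The noise-then-denoise idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard DDPM-style linear noise schedule (the abstract does not state the schedule), uses the closed-form forward sample in place of literally iterating the noising steps, and takes the reverse step as a hypothetical callable `denoise_step(x, t, prompt)` standing in for a text-conditioned diffusion model. The names `make_alpha_bar`, `forward_diffuse`, `purify`, and `t_star` are illustrative only.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


def purify(x_adv, t_star, alpha_bar, denoise_step, rng, prompt=None):
    """Noise the input up to step t_star, then run the reverse process to 0.

    denoise_step is a hypothetical placeholder for one reverse step of a
    text-conditioned diffusion model; prompt is the guiding text.
    """
    x = forward_diffuse(x_adv, t_star, alpha_bar, rng)
    for t in range(t_star, -1, -1):
        x = denoise_step(x, t, prompt)
    return x


if __name__ == "__main__":
    alpha_bar = make_alpha_bar()
    t_star = 300  # assumed noising depth, not a value from the paper
    # The adversarial perturbation is scaled by sqrt(alpha_bar_t), while the
    # injected Gaussian noise has standard deviation sqrt(1 - alpha_bar_t).
    print("perturbation scale:", np.sqrt(alpha_bar[t_star]))
    print("noise std:", np.sqrt(1.0 - alpha_bar[t_star]))
```

Under these assumptions, a deeper `t_star` attenuates the perturbation further relative to the injected noise but also discards more of the original image content, which is where text guidance during the reverse process would help steer reconstruction.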
Zhu et al. (Mon,) studied this question.