October 30, 2017 · Open Access

PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples

Abstract

Adversarial perturbations of normal images are usually imperceptible to humans, but they can seriously confuse state-of-the-art machine learning models. What makes them so special in the eyes of image classifiers? In this paper, we show empirically that adversarial examples mainly lie in the low probability regions of the training distribution, regardless of attack types and targeted models. Using statistical hypothesis testing, we find that modern neural density models are surprisingly good at detecting imperceptible image perturbations. Based on this discovery, we devised PixelDefend, a new approach that purifies a maliciously perturbed image by moving it back towards the distribution seen in the training data. The purified image is then run through an unmodified classifier, making our method agnostic to both the classifier and the attacking method. As a result, PixelDefend can be used to protect already deployed models and be combined with other model-specific defenses. Experiments show that our method greatly improves resilience across a wide variety of state-of-the-art attacking methods, increasing accuracy on the strongest attack from 63% to 84% for Fashion MNIST and from 32% to 70% for CIFAR-10.
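The detection idea in the abstract (score an image under a density model, then test whether that score is unusually low compared with training data) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the per-pixel Gaussian `gaussian_log_density` is a toy stand-in for a neural density model, `rank_p_value` is a generic rank-based p-value rather than necessarily the exact test used in the paper, and the constant pixel shift is not a real attack.

```python
import math
import random


def gaussian_log_density(x, mean, std):
    """Log-density of x under independent per-pixel Gaussians.

    Hypothetical toy stand-in for a learned density model.
    """
    return sum(
        -0.5 * math.log(2 * math.pi * std * std)
        - (xi - mean) ** 2 / (2 * std * std)
        for xi in x
    )


def rank_p_value(test_score, train_scores):
    """Rank-based p-value with add-one smoothing.

    Counts how many training scores are no larger than the test score;
    a small value means the test input looks less likely than almost
    all training inputs.
    """
    at_most = sum(1 for s in train_scores if s <= test_score)
    return (at_most + 1) / (len(train_scores) + 1)


rng = random.Random(0)
mean, std, dim = 0.5, 0.1, 64

# Toy "training images": 64 pixel values drawn near 0.5.
train = [[rng.gauss(mean, std) for _ in range(dim)] for _ in range(500)]
train_scores = [gaussian_log_density(x, mean, std) for x in train]

clean = [rng.gauss(mean, std) for _ in range(dim)]
# Toy "perturbation": shift every pixel by a fixed amount (not a real attack).
perturbed = [xi + 0.3 for xi in clean]

p_clean = rank_p_value(gaussian_log_density(clean, mean, std), train_scores)
p_perturbed = rank_p_value(gaussian_log_density(perturbed, mean, std), train_scores)
print(f"clean p-value: {p_clean:.3f}, perturbed p-value: {p_perturbed:.3f}")
```

In this toy setup the shifted image scores far below every training image, so its p-value falls to the smallest value the test can report, 1/(N+1), while the clean image receives an ordinary p-value. A real detector would replace the Gaussian with a trained density model and use actual adversarial inputs.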

Cite This Study

Song et al. (2017) studied this question.

synapsesocial.com/papers/6a0ea429a14f152feaf9a4db https://doi.org/10.48550/arxiv.1710.10766
