Deep neural networks (DNNs) have been widely adopted, but they are vulnerable to intentionally crafted adversarial examples. Various attack methods against DNNs have been proposed, yet a theoretical explanation of adversarial examples is still lacking. In this paper, we aim to understand adversarial examples from the perspective of the attacking process, and we assume that adding perturbations to the key (sensitive) regions of an image could fool image classification DNNs. To verify this assumption, we propose gradient shielding, which ignores insensitive information while generating adversarial examples. Specifically, we propose the interactive gradient shielding (IGS) method, which selects sensitive regions and then applies a gradient-based attack. To remove the region selection step, we propose the adaptive gradient shielding (AGS) method, which ignores insensitive gradients automatically. We conduct extensive experiments to evaluate the performance of both methods, and the results corroborate our perspective. With this method, we won first place in the IJCAI-AAAC 2019 Non-targeted Adversarial Attack competition.
Gu et al. studied this question.
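The idea of shielding gradients before a gradient-based attack step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how sensitive regions or insensitive gradients are identified, so the code below assumes a user-supplied boolean region mask for the interactive variant and a magnitude-quantile threshold for the adaptive variant, and it assumes a single sign-gradient (FGSM-style) step as the attack. The names `shield_by_region`, `shield_by_magnitude`, `shielded_sign_step`, `keep_ratio`, and `epsilon` are hypothetical.

```python
import numpy as np


def shield_by_region(grad, region_mask):
    """IGS-like shielding (assumed form): keep gradients only inside a
    user-selected boolean region; zero them everywhere else."""
    return grad * region_mask


def shield_by_magnitude(grad, keep_ratio=0.2):
    """AGS-like shielding (assumed form): keep only the entries whose
    magnitude is at or above the (1 - keep_ratio) quantile."""
    magnitude = np.abs(grad)
    threshold = np.quantile(magnitude, 1.0 - keep_ratio)
    mask = magnitude >= threshold
    return grad * mask, mask


def shielded_sign_step(image, shielded_grad, epsilon=8 / 255):
    """One sign-gradient step (assumed attack). Pixels whose gradient was
    shielded to zero receive no perturbation, because sign(0) is 0."""
    adversarial = image + epsilon * np.sign(shielded_grad)
    # Keep pixel values in the valid [0, 1] range.
    return np.clip(adversarial, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.uniform(0.2, 0.8, size=(8, 8))
    # Stand-in for the loss gradient with respect to the input image;
    # in practice this would come from backpropagation through the model.
    grad = rng.normal(size=(8, 8))

    shielded, mask = shield_by_magnitude(grad, keep_ratio=0.25)
    adversarial = shielded_sign_step(image, shielded)
    print("perturbed pixels:", int(np.count_nonzero(adversarial != image)))
```

In this sketch the perturbation budget is spent only on the retained pixels, which mirrors the assumption stated in the abstract that perturbing the sensitive regions alone may be enough to change the prediction.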