The transferability of adversarial examples across different models has drawn considerable attention recently, particularly in the targeted setting. Prior research has empirically shown that optimizing adversarial perturbations at the neighboring points with the highest loss improves transferability. While effective, such a method requires multiple iterations to reach a local maximum and disregards the local minima of the input loss landscape. In this paper, we theoretically show that adversarial transferability can be enhanced by flattening the input loss landscape, which is accomplished by optimizing the perturbation at both local maxima and local minima. Moreover, we propose the Cost-efficient LandscapE Flattening (CLEF) attack, which accounts for the local maxima and minima around the current input in a cost-efficient way in order to flatten the loss landscape and improve adversarial transferability. Specifically, we reuse the gradients of the previous attack step to help the current input reach a local maximum, and we employ probabilistic modeling to learn a distributional representation of the perturbations that help the current input reach a local minimum. This probabilistic model can be pre-trained on dozens of images from other domains, so that perturbations of this type can be sampled directly from the pre-trained distribution at attack time. Experimental results demonstrate that integrating local maxima and minima into targeted transferable attacks significantly flattens the loss landscape of the crafted adversarial examples, resulting in improved adversarial transferability.
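The two ingredients described above, reusing the previous step's gradient to find a high-loss neighbor and sampling a perturbation from a fixed distribution to find a second neighbor, can be illustrated with a minimal NumPy sketch. It is not the CLEF implementation. The toy loss, the function names (`toy_loss_and_grad`, `flattening_attack_sketch`), the step sizes, and the equal-weight averaging of the two gradients are all assumptions made for illustration. In particular, the Gaussian with hypothetical parameters `mu` and `sigma` only stands in for the pre-trained distribution and does not by itself steer the input toward a local minimum.

```python
import numpy as np


def toy_loss_and_grad(x, target):
    """Toy differentiable 'targeted' loss with an analytic gradient (illustrative only)."""
    d = x - target
    loss = 0.5 * np.sum(d ** 2) + 0.1 * np.sum(np.sin(3.0 * x))
    grad = d + 0.3 * np.cos(3.0 * x)
    return loss, grad


def flattening_attack_sketch(x0, target, steps=50, eps=0.5, alpha=0.05,
                             rho=0.05, mu=0.0, sigma=0.05, seed=0):
    """Sign-gradient targeted attack that averages gradients at two neighbors.

    Assumptions (not taken from the paper): the neighbor radius rho, the
    Gaussian N(mu, sigma^2) standing in for the pre-trained perturbation
    distribution, and the 50/50 averaging of the two gradients.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    prev_grad = np.zeros_like(x0)  # no gradient yet, so the first neighbor equals x
    for _ in range(steps):
        # High-loss neighbor: move along the gradient reused from the previous step.
        x_high = x + rho * np.sign(prev_grad)
        # Second neighbor: perturbation sampled from the stand-in distribution.
        x_sampled = x + rng.normal(mu, sigma, size=x.shape)
        _, g_high = toy_loss_and_grad(x_high, target)
        _, g_sampled = toy_loss_and_grad(x_sampled, target)
        grad = 0.5 * (g_high + g_sampled)
        # Targeted attack: descend the loss, then project back into the eps-ball.
        x = np.clip(x - alpha * np.sign(grad), x0 - eps, x0 + eps)
        prev_grad = grad
    return x


x0 = np.zeros(4)
target = np.ones(4)
x_adv = flattening_attack_sketch(x0, target)
print("loss before:", toy_loss_and_grad(x0, target)[0])
print("loss after: ", toy_loss_and_grad(x_adv, target)[0])
```

In this sketch the high-loss neighbor costs no gradient computation to locate, because it reuses the gradient already computed in the previous step. The sampled neighbor costs only a random draw. Each step does still evaluate the gradient at both neighbors.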
Wei et al. (Thu,) studied this question.