Physical adversarial attacks have advanced rapidly, with numerous methods developed to overcome the challenge of applying perturbations in real-world settings. Less attention has been given to the challenge of information access. Most adversarial attacks assume either a white-box setting or an information-constrained black-box setting. Although prior work has explored universal adversarial examples and attacks without direct access to target networks, the existing literature does not support the broad application of existing adversarial methods in what we introduce as the "box-agnostic scenario". Unlike the black-box setting, which assumes access to both the inputs and the outputs of the target network, the box-agnostic scenario assumes knowledge only of the input image, with no access to classification outputs. To address this challenge, we introduce Multi-Targeted Gradient Training (MTGT), a novel approach that trains encoder-decoder architectures on the combined gradients of multiple pretrained classifiers. Because these classifiers span diverse architectures, MTGT captures a wide range of feature detectors, allowing feature-rich regions to emerge naturally during training. We also introduce an order-based loss function that emphasizes the most salient pixels in the combined gradients, guiding the network toward the features most critical to successful attacks. The trained network can then identify high-information areas within an image from the input alone, enabling adversarial attacks that target these regions rather than relying on any single network's gradients. We evaluate MTGT by testing its attacks against networks outside the set used during training, demonstrating its potential to generalize to unseen architectures.
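The two ideas named above, combining gradients from several classifiers and weighting a loss by pixel order, can be illustrated with a minimal sketch. It is not the MTGT implementation: the toy linear softmax classifiers stand in for pretrained networks, the function names (`input_gradient`, `combined_saliency`, `order_weighted_loss`) are hypothetical, and the rank-based weighting is only one assumed way an order-based loss might emphasize the most salient pixels.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def input_gradient(weights, x, label):
    """Gradient of the cross-entropy loss w.r.t. the input x for a toy
    linear softmax classifier (assumption: stands in for a pretrained net).
    weights holds one row of len(x) coefficients per class."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    probs = softmax(logits)
    # dL/dlogit_c = p_c - 1[c == label]; chain rule through logits = W x.
    dlogits = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    return [
        sum(dlogits[c] * weights[c][i] for c in range(len(weights)))
        for i in range(len(x))
    ]


def combined_saliency(models, x, label):
    """Average the per-model gradient magnitudes, each scaled so that its
    largest entry is 1, so no single model dominates the combined map."""
    total = [0.0] * len(x)
    for weights in models:
        mags = [abs(g) for g in input_gradient(weights, x, label)]
        peak = max(mags) or 1.0  # avoid dividing by zero on a flat gradient
        total = [t + m / peak for t, m in zip(total, mags)]
    return [t / len(models) for t in total]


def order_weighted_loss(pred, target):
    """Squared error weighted by the rank of each target pixel (assumed
    form): the most salient pixel gets weight 1, the least gets 1/n."""
    n = len(target)
    order = sorted(range(n), key=lambda i: target[i], reverse=True)
    rank_weight = [0.0] * n
    for rank, i in enumerate(order):
        rank_weight[i] = (n - rank) / n
    return sum(
        w * (p - t) ** 2 for w, p, t in zip(rank_weight, pred, target)
    ) / n


# Two toy 2-class "classifiers" over a flattened 4-pixel image.
models = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[2.0, 0.0, 0.0, 0.5], [0.0, 1.0, 0.0, 0.0]],
]
x = [0.5, 0.2, 0.9, 0.1]
saliency = combined_saliency(models, x, label=0)

# The same error costs more on the most salient pixel than on the least.
miss_top = [saliency[0] - 0.1] + saliency[1:]
miss_low = saliency[:2] + [saliency[2] + 0.1] + saliency[3:]
loss_top = order_weighted_loss(miss_top, saliency)
loss_low = order_weighted_loss(miss_low, saliency)
```

In this sketch the combined map serves as the regression target that an encoder-decoder would learn to predict from the image alone, which matches the box-agnostic constraint of having no access to the target network's outputs. Replacing the toy classifiers with real pretrained networks would mean obtaining input gradients through a deep-learning framework's automatic differentiation rather than the closed form used here.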
Hodes et al. (Wed,) studied this question.