As machine learning is increasingly applied in healthcare, finance, and security, black-box adversarial attacks, which manipulate inputs to mislead a model without access to its internal details, pose a growing security threat in real-world applications. This paper reviews black-box adversarial attacks by categorizing perturbation generation methods, such as gradient estimation and evolutionary search, along with attack targets, including directional (targeted) and non-directional (untargeted) types. It also divides attack scenarios into white-box and black-box categories according to the attacker's knowledge, analyzes their effects across different machine learning tasks, and summarizes related challenges and opportunities. The results indicate that in image classification tasks, zeroth-order optimization (ZOO) demands a large number of queries to reach a high success rate, whereas evolutionary algorithms enhance defense effectiveness by about 30% but involve greater computational costs. For large language models, template injection attacks achieve success rates greater than 80%, while the PathSeeker multi-agent reinforcement learning framework achieves over 80% detection evasion by dynamically adjusting templates. However, cross-domain transfer remains severely limited. Key future directions include meta-learning with Bayesian optimization, cross-modal attack frameworks, and testing standards for high-risk settings.
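To illustrate why gradient estimation is query-hungry in the black-box setting, here is a minimal sketch of a coordinate-wise finite-difference gradient estimate, the general idea behind ZOO-style attacks. It is not the paper's implementation: `zoo_gradient`, `toy_loss`, and the step size `h` are hypothetical names and values chosen for illustration, and the toy loss stands in for a model that the attacker can only query.

```python
from typing import Callable, List, Tuple


def zoo_gradient(
    query: Callable[[List[float]], float],
    x: List[float],
    h: float = 1e-4,
) -> Tuple[List[float], int]:
    """Estimate the gradient of a black-box score with symmetric differences.

    Returns the estimate and the number of queries spent (2 per coordinate).
    """
    grad: List[float] = []
    n_queries = 0
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += h   # probe the i-th coordinate in the positive direction
        minus[i] -= h  # and in the negative direction
        grad.append((query(plus) - query(minus)) / (2 * h))
        n_queries += 2
    return grad, n_queries


def toy_loss(x: List[float]) -> float:
    # Hypothetical stand-in for a model's loss; the attacker sees outputs only.
    return sum((xi - 1.0) ** 2 for xi in x)


grad, n_queries = zoo_gradient(toy_loss, [0.0, 2.0, 1.0])
print(grad, n_queries)
```

Under this scheme the query count grows linearly with input dimension: a 224 x 224 x 3 image has 150,528 coordinates, so one full gradient estimate would cost 301,056 queries. Practical attacks typically reduce this cost by sampling a subset of coordinates or working in a lower-dimensional space.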
Tianyi Huang
Applied and Computational Engineering
Shenzhen University
www.synapsesocial.com/papers/68d6c68eb1249cec298b2fe4 — DOI: https://doi.org/10.54254/2755-2721/2025.bj27247