September 24, 2025

Open Access

Concealment of Intent: A Game-Theoretic Analysis

Key Points

Intent-hiding adversarial prompting conceals malicious intent through the composition of skills and undermines alignment defenses.
A game-theoretic analysis of attacks against prompt and response filtering identifies equilibrium points and reveals structural advantages for the attacker.
A defense mechanism tailored to intent-hiding attacks is proposed and analyzed.
Empirical tests on multiple real-world LLMs show clear advantages for the attack over existing adversarial prompting techniques across a range of malicious behaviors.

Abstract

As large language models (LLMs) grow more capable, concerns about their safe deployment have also grown. Although alignment mechanisms have been introduced to deter misuse, they remain vulnerable to carefully designed adversarial prompts. In this work, we present a scalable attack strategy: intent-hiding adversarial prompting, which conceals malicious intent through the composition of skills. We develop a game-theoretic framework to model the interaction between such attacks and defense systems that apply both prompt and response filtering. Our analysis identifies equilibrium points and reveals structural advantages for the attacker. To counter these threats, we propose and analyze a defense mechanism tailored to intent-hiding attacks. Empirically, we validate the attack's effectiveness on multiple real-world LLMs across a range of malicious behaviors, demonstrating clear advantages over existing adversarial prompting techniques.

Authors

Xinbo Wu

Tongji University

Abhishek K. Umrawal

University of Maryland, Baltimore

Lav R. Varshney

University of Illinois Urbana-Champaign
