April 28, 2026 | Open Access

Tool Output Mimicry: Bypassing Multi-Layer Agentic AI Defenses via Upstream-Agent Impersonation in User-Controlled Fields

Key Points

The study aims to formalize Tool Output Mimicry and its impact on AI orchestration systems.
Refinement of indirect prompt injection targeting inter-agent trust boundaries.
Empirical validation against the OWASP FinBot CTF using the attack chain.
Combination of mimicry with tool-description poisoning and a gate-passing side-channel artefact.
Successful capture of the fine-print challenge, leading to an unauthorized US$8,000 transfer against a US$5,000 invoice.
The attack chain was the only method to succeed among 20+ attempts.
Demonstrates the need for mitigations centered on authenticated task summaries and cross-agent value continuity.

Abstract

We document Tool Output Mimicry, a refinement of indirect prompt injection that targets the inter-agent trust boundary in multi-agent AI orchestration systems. The technique exploits a structural property of modern orchestrators: each agent's task summary is forwarded as authoritative context to the next agent in the pipeline, with the orchestrator's system prompt typically directing "pass the FULL task summary forward, do not summarize or filter". A user-controllable field that the downstream agent will read (an invoice description, a vendor profile field, document content) can be crafted to impersonate the structured output of an upstream agent. The downstream agent then issues redirected tool calls without violating its prompt-level guardrails. The mimicry primitive is not sufficient on its own: in the layered-defense setting we study, it must be combined with tool-description poisoning (a known supply-chain attack, OWASP LLM03) and a gate-passing side-channel artefact. The combined attack chain succeeds where each component alone fails. We empirically validate the chain against the OWASP FinBot CTF, where it was the only technique among twenty-plus attempts in two engagement sessions to capture the fine-print challenge, causing a payment-processor agent to issue a US$8,000 transfer against a US$5,000 invoice.

Contributions:
- Formalization of Tool Output Mimicry as a refinement of indirect prompt injection specialised to the inter-agent trust boundary, with format-fidelity and hierarchy-fidelity properties.
- Reusable attack-construction templates for payment override, vendor-status manipulation, and arbitrary downstream tool redirection.
- Empirical validation against the OWASP FinBot CTF (single capture of the fine-print challenge across 20+ attempts).
- Mitigations centered on authenticated task summaries (M1) and cross-agent value continuity (O2), with honest discussion of the limits of single-engagement empirical evidence.

Keywords: agentic AI security, multi-agent orchestration, indirect prompt injection, inter-agent trust, Model Context Protocol, OWASP LLM Top 10, MITRE ATLAS, supply chain security.

Companion materials: reference implementation in the Ai-EGIS toolkit (i-314 Security Research). Markdown sources (EN/ES) and LaTeX/BibTeX sources are included in the source tarball.
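
To make the two mitigations named in the abstract more concrete, the following is a minimal defensive sketch and not the paper's reference implementation. It assumes that "authenticated task summaries" (M1) could be realised with an HMAC tag attached by the orchestrator, and that "cross-agent value continuity" (O2) could be a simple equality check between the invoice amount and the transfer amount. The key, the envelope layout, and the function names `sign_summary`, `verify_summary`, and `amounts_consistent` are all hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical secret held only by the orchestrator; it must never appear in
# any user-controllable field or in an agent's prompt context.
ORCHESTRATOR_KEY = b"demo-key-not-for-production"


def sign_summary(agent_id: str, summary: dict, key: bytes = ORCHESTRATOR_KEY) -> dict:
    """Wrap an agent's task summary in an envelope carrying an HMAC-SHA256 tag."""
    payload = json.dumps({"agent": agent_id, "summary": summary}, sort_keys=True)
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_summary(envelope: dict, key: bytes = ORCHESTRATOR_KEY) -> Optional[dict]:
    """Return the decoded payload if the tag verifies, otherwise None."""
    payload = envelope.get("payload")
    tag = envelope.get("tag")
    if not isinstance(payload, str) or not isinstance(tag, str):
        return None
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; text that merely looks like an
    # upstream summary carries no valid tag and is rejected here.
    if not hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8")):
        return None
    return json.loads(payload)


def amounts_consistent(invoice_amount_usd: int, transfer_amount_usd: int) -> bool:
    """Value continuity (assumed form): the transfer must equal the invoice amount."""
    return invoice_amount_usd == transfer_amount_usd
```

Under these assumptions, a downstream agent would act only on summaries for which `verify_summary` returns a payload, and a US$8,000 transfer against a US$5,000 invoice would fail `amounts_consistent` regardless of what the surrounding text claims.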