Abstract
Retrieval-Augmented Generation (RAG) has become a widely adopted method for deploying Large Language Models (LLMs) in enterprise environments because it grounds outputs in organisational knowledge bases and reduces hallucinations. However, RAG introduces a distinct vulnerability: prompt injection attacks embedded within retrieved documents. In such attacks, adversarial instructions placed inside documents override system policies and cause harmful model behaviour, including data leakage, policy violation, misinformation, or unsafe tool execution. This preprint proposes a benchmark-driven framework for evaluating prompt injection robustness in enterprise RAG assistants. It defines enterprise threat models covering insider document poisoning, supply chain document injection, and external user-upload scenarios. It also proposes a methodology for constructing datasets of adversarial document-query pairs, together with evaluation tasks and security metrics such as Injection Success Rate, Policy Violation Rate, Confidentiality Leakage Score, and Grounding Accuracy. Practical mitigation strategies are reviewed, including instruction boundary enforcement, retrieval filtering, sanitisation, and verification-based generation. The work supports secure deployment of RAG systems in regulated environments such as finance, healthcare, and public services.
Keywords: Retrieval-Augmented Generation, Prompt Injection, LLM Security, Enterprise AI, Cybersecurity, Benchmarking
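The abstract names its security metrics without defining them. As a rough illustration only, the sketch below assumes that Injection Success Rate is the fraction of trials with an injected document in which the model followed the injected instruction, and that Policy Violation Rate is the fraction of all trials whose output broke a system policy. The `Trial` record, its field names, and both definitions are hypothetical and may differ from those used in the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Trial:
    """One benchmark run (hypothetical record; fields are assumptions)."""
    injected: bool          # retrieved context contained an adversarial instruction
    attack_succeeded: bool  # model followed the injected instruction
    policy_violated: bool   # model output broke a system policy


def _rate(flags: Iterable[bool]) -> float:
    """Fraction of True values; 0.0 for an empty sequence."""
    values: List[bool] = list(flags)
    return sum(values) / len(values) if values else 0.0


def injection_success_rate(trials: Iterable[Trial]) -> float:
    """Assumed definition: successful attacks / trials with an injection."""
    return _rate(t.attack_succeeded for t in trials if t.injected)


def policy_violation_rate(trials: Iterable[Trial]) -> float:
    """Assumed definition: policy-violating outputs / all trials."""
    return _rate(t.policy_violated for t in trials)
```

Under these assumed definitions, benign trials (no injected document) count towards the Policy Violation Rate denominator but not towards Injection Success Rate, which keeps the two metrics from conflating attack effectiveness with baseline policy compliance.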
Mohammed Faizan Sayeed
Mohammed Faizan Sayeed (Thu,) studied this question.
synapsesocial.com/papers/698829520fc35cd7a8849874 — DOI: https://doi.org/10.5281/zenodo.18496798