Retrieval-Augmented Generation (RAG) improves the performance of Large Language Models (LLMs) by retrieving relevant information from external knowledge bases and integrating it into generation, which helps produce more accurate responses. However, RAG is vulnerable to retrieval poisoning attacks, in which attackers inject malicious documents into the retrieval process to induce an LLM to produce inaccurate responses. In this paper, we propose ShieldRAG, a novel defense framework that counteracts retrieval poisoning attacks by reshaping the retrieval embedding space. ShieldRAG achieves a dual-strategy effect through a majority-consensus mechanism: ① Push: implicitly moves the embedding of a user query away from malicious documents by filtering out their minority signals, reducing their influence. ② Pull: aligns the embedding of a user query more closely with those of benign documents, reinforcing accurate retrieval. These strategies work together to preserve retrieval integrity and improve the quality of LLM-generated responses. Specifically, ShieldRAG operates in three key steps: Sliding Retrieval Explanation Generation, Keyword Aggregation, and Query Targeting Optimization. Together, these steps integrate information from benign sources while filtering out malicious interference, thereby significantly enhancing the robustness of RAG systems against retrieval poisoning attacks. We evaluate ShieldRAG on four open-domain Question Answering (QA) datasets (Natural Questions, MS-MARCO, HotpotQA, and 2WikiMultiHopQA) using seven representative LLMs. Extensive experiments demonstrate that ShieldRAG significantly improves response accuracy while mitigating adversarial effects, and that it generalizes well across multiple datasets and LLM architectures.
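The majority-consensus idea behind Keyword Aggregation can be illustrated with a toy sketch. The abstract does not specify how keywords are extracted, counted, or folded back into the query, so everything here is an assumption made for illustration: the function names `aggregate_keywords` and `refine_query`, the `min_support` threshold, and the sample data are all hypothetical and are not taken from ShieldRAG itself. The sketch only shows how keywords backed by a majority of retrieval windows could be kept while minority (possibly poisoned) signals are dropped.

```python
from collections import Counter


def aggregate_keywords(window_keywords, min_support=0.5):
    """Return keywords that occur in more than `min_support` of the windows.

    Hypothetical helper: `window_keywords` is a list of keyword lists,
    one per retrieval window. The threshold is an assumed parameter.
    """
    counts = Counter()
    for keywords in window_keywords:
        # Count each keyword at most once per window (case-insensitive).
        counts.update({k.lower() for k in keywords})
    threshold = min_support * len(window_keywords)
    return sorted(k for k, c in counts.items() if c > threshold)


def refine_query(query, consensus_keywords):
    """Append consensus keywords to the query (an assumed refinement step)."""
    if not consensus_keywords:
        return query
    return query + " " + " ".join(consensus_keywords)


# Illustrative data only; the third window stands in for a poisoned one.
windows = [
    ["Paris", "capital", "France"],
    ["Paris", "France", "Seine"],
    ["Lyon", "capital", "France"],
    ["Paris", "France"],
]
consensus = aggregate_keywords(windows)
refined = refine_query("What is the capital of France?", consensus)
print(consensus)
print(refined)
```

In this toy setting, the keyword that appears only in the minority window ("lyon") falls below the threshold and never reaches the refined query, which loosely mirrors the Push effect, while the keywords shared by most windows are reinforced, loosely mirroring Pull. A real system would operate on embeddings rather than raw keyword strings.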
Longzhu He
Beijing University of Posts and Telecommunications
Xi Zhang
General Cardiology
Quan Liu
ACM Transactions on Information Systems
Beijing University of Posts and Telecommunications
Chongqing University of Posts and Telecommunications
Beijing Academy of Artificial Intelligence
He et al. studied this question.
synapsesocial.com/papers/69acc56732b0ef16a404f760 — DOI: https://doi.org/10.1145/3800948