With the rapid development of large language models (LLMs) for natural language processing and generation, researchers continue to evaluate their potential in medical artificial intelligence (AI). Generative Pre-trained Transformer 5 (GPT-5), Pathways Language Model 2 (PaLM-2), Large Language Model Meta AI 3 (Llama-3), Gemini-2.5, and various domain-specific medical models have demonstrated excellent performance in many medical and health-related tasks. However, LLMs still face key bottlenecks in real-world medical scenarios, including hallucinations, slow knowledge updates, and a lack of interpretability. These problems severely restrict the safe deployment of LLMs in high-risk medical scenarios and have become a major obstacle to the transition of medical AI from research to clinical application. Retrieval-augmented generation (RAG) has therefore become an important approach to improving the credibility of medical LLMs. By retrieving authoritative medical knowledge before generation, RAG significantly reduces the risk of factual errors and allows knowledge to be kept up to date without retraining. A review of research progress on RAG in the medical field is therefore of theoretical and practical value for promoting the design, evaluation, and standardized application of trustworthy medical AI. This review systematically summarizes the research progress of RAG in medical scenarios, covering the technical frameworks, typical applications, and development trends of three types of methods (naive RAG, advanced RAG, and modular RAG), and discusses their significance for medical reliability, health equity, and personalized medicine.
The naive RAG architecture follows a standard “index-retrieval-generation” process: it transforms external knowledge bases into a structured vector space that an LLM can query, and it has achieved positive results in electronic medical record (EMR) summarization, preoperative assessment, and medical question answering (QA). However, naive RAG still has limitations in retrieval accuracy and cross-modal processing. To address these shortcomings, advanced RAG methods have been devised that significantly improve the model’s decision-making capabilities by refining retrieval strategies, strengthening reasoning chains, and introducing self-reflection mechanisms. Furthermore, modular RAG designs offer composable, multi-module systems that support multi-source knowledge integration and complex task decomposition. In terms of application value, RAG’s contributions to healthcare are reflected primarily in two aspects. On the one hand, it significantly improves the reliability of medical AI by introducing traceable knowledge that reduces hallucinations. On the other hand, it helps promote healthcare equity by driving a shift toward a patient-centered healthcare model through localized knowledge bases and multilingual support. However, the deployment of RAG in medical scenarios still faces many challenges, such as data privacy risks. Future research should focus on improving self-supervised reflection capabilities and developing cross-modal knowledge fusion technologies. As research progresses, RAG is expected to become a core foundational capability of future intelligent healthcare systems, further promoting the development of safe and efficient intelligent healthcare.
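The “index-retrieval-generation” process described above can be illustrated with a minimal Python sketch. It is not the implementation used in any of the reviewed systems: the bag-of-words `embed` function stands in for a learned dense encoder, `KNOWLEDGE_BASE` holds placeholder strings rather than real clinical content, and `stub_llm` is a hypothetical stand-in for a call to an actual LLM. All function names are assumptions introduced for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy bag-of-words vector (assumption); real systems use a dense encoder.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(passages):
    # Index step: store each passage next to its vector.
    return [(passage, embed(passage)) for passage in passages]


def retrieve(index, query, k=2):
    # Retrieval step: rank passages by similarity to the query, keep the top k.
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [passage for passage, _ in ranked[:k]]


def build_prompt(query, passages):
    # Number the sources so the answer can cite them (traceable knowledge).
    sources = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the numbered sources and cite them.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


def answer(index, query, llm, k=2):
    # Generation step: pass the retrieved context to the language model.
    return llm(build_prompt(query, retrieve(index, query, k)))


# Placeholder passages (assumption): illustrative text, not clinical guidance.
KNOWLEDGE_BASE = [
    "Placeholder note: preoperative assessment checklist covers fasting and medication review.",
    "Placeholder note: discharge summary template for electronic medical records.",
    "Placeholder note: clinic opening hours and appointment booking.",
]


def stub_llm(prompt):
    # Hypothetical stand-in for a real LLM call; it only reports the prompt size.
    return f"(stub) received a prompt of {len(prompt)} characters"


index = build_index(KNOWLEDGE_BASE)
query = "What does the preoperative assessment checklist cover?"
print(retrieve(index, query, k=1))
print(answer(index, query, stub_llm))
```

Because the knowledge lives in the index rather than in model weights, updating it means rebuilding the index from new passages, with no retraining involved. The limitations noted for naive RAG also show up here: retrieval quality depends entirely on how well the similarity function matches the query to the right passage.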
HAO Kechun
Bin Wang
Digital Medicine
Northeastern University
www.synapsesocial.com/papers/699fe3af95ddcd3a253e7cb1 — DOI: https://doi.org/10.1097/dm-2025-00015