Retrieval-Augmented Generation (RAG) combined with Large Language Models (LLMs) offers a new paradigm for clinical-trial data analysis that is both real-time and knowledge-traceable. Used alone, LLMs often suffer from limited real-world grounding, hallucination risk, and restricted context windows. RAG systems improve factual consistency but depend heavily on retrieval quality and can produce incoherent syntheses when the retrieved evidence is misaligned. These limitations underline the complementary nature of integrating RAG and LLM architectures for clinical reporting. This study targets a multi-site, real-world data environment and builds a hierarchical RAG pipeline spanning electronic health records (EHRs), National Health Insurance (NHI) billing codes, and image-vector indices. The LLM is optimized through lightweight LoRA/QLoRA fine-tuning and reinforcement-learning-based alignment. The system first retrieves key textual and imaging evidence from the heterogeneous data repositories and then fuses this evidence into the context window for clinical report generation. Experimental results show marked improvements over traditional manual statistics and prompt-only models in retrieval accuracy, textual coherence, and response latency, together with reduced human error and workload. The proposed multimodal RAG-LLM workflow achieved statistically significant gains in three core metrics (recall, factual consistency, and expert ratings) and substantially shortened overall report-generation time relative to conventional manual processes. Quantitatively, the system achieved a Composite Quality Index (CQI) of 78.3, outperforming strong baselines such as Med-PaLM 2 (72.6) and PMC-LLaMA (74.3), and reduced report drafting time by more than 75% (p < 0.01). These findings confirm the practical feasibility of the framework for fully automated clinical reporting.
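The retrieve-then-fuse step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Evidence` record, the token-overlap scorer, the per-source top-k rule, and the character budget are all hypothetical stand-ins chosen to keep the example self-contained. A real system of the kind described would presumably use dense vector indices, with image embeddings in place of the caption text used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str   # hypothetical source tag, e.g. "EHR", "NHI", "IMAGE"
    doc_id: str   # hypothetical identifier, kept so the output stays traceable
    text: str     # note text, billing-code description, or image caption


def overlap_score(query: str, text: str) -> float:
    """Toy lexical relevance: fraction of query tokens that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def retrieve(query: str, corpus: list, k: int = 2) -> list:
    """Take the top-k items from each source, then rank the pooled hits."""
    by_source = {}
    for ev in corpus:
        by_source.setdefault(ev.source, []).append(ev)
    hits = []
    for items in by_source.values():
        scored = [(overlap_score(query, ev.text), ev) for ev in items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        hits.extend(scored[:k])
    hits = [pair for pair in hits if pair[0] > 0]      # drop irrelevant items
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps ties in order
    return hits


def fuse_context(query: str, hits: list, max_chars: int = 400) -> str:
    """Pack ranked evidence into one prompt, tagging each snippet with its origin."""
    lines, used = [], 0
    for _score, ev in hits:
        line = f"[{ev.source}:{ev.doc_id}] {ev.text}"
        if used + len(line) > max_chars:
            continue  # snippet does not fit the remaining budget
        lines.append(line)
        used += len(line)
    return "Evidence:\n" + "\n".join(lines) + f"\n\nTask: {query}"


# Invented example records, for illustration only.
corpus = [
    Evidence("EHR", "note-1", "patient reports chest pain after dose escalation"),
    Evidence("EHR", "note-2", "routine visit with no adverse events"),
    Evidence("NHI", "claim-7", "billing code for chest radiograph"),
    Evidence("IMAGE", "img-3", "chest radiograph caption: mild pleural effusion"),
]
query = "summarize chest findings"
hits = retrieve(query, corpus)
prompt = fuse_context(query, hits)
print(prompt)
```

Tagging each snippet with its source and identifier is one simple way to keep generated text traceable to the evidence behind it. The fixed budget stands in for the restricted context window noted above.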
Kuo et al. (Fri,) studied this question.