Retrieval-Augmented Generation (RAG) systems have emerged as a promising way to enhance large language models (LLMs) by integrating external knowledge retrieval with generative capabilities. While significant advances have been made in retrieval accuracy and response quality, a critical challenge remains: internal knowledge integration and retrieval-generation interactions in RAG workflows are largely opaque. This paper introduces RAGTrace, an interactive evaluation system designed to analyze retrieval and generation dynamics in RAG-based workflows. Informed by a comprehensive literature review and expert interviews, the system supports multi-level analysis, ranging from high-level performance evaluation to fine-grained examination of retrieval relevance, generation fidelity, and cross-component interactions. Unlike conventional evaluation practices that assess retrieval or generation quality in isolation, RAGTrace enables an integrated exploration of retrieval-generation relationships, allowing users to trace knowledge sources and identify potential failure cases. The system's workflow allows users to build, evaluate, and iterate on retrieval processes tailored to their specific domains of interest. The effectiveness of the system is demonstrated through case studies and expert evaluations on real-world RAG applications.
Shu Qin Cheng
Jiaping Li
Southern University of Science and Technology
Huanchen Wang
City University of Hong Kong
Southern University of Science and Technology
Cheng et al. studied this question.
synapsesocial.com/papers/68d90a0a41e1c178a14f6898 — DOI: https://doi.org/10.1145/3746059.3747741