What question did this study set out to answer?

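The study asks whether combining hierarchical retrieval, knowledge graph traversal, caching, and compositional generation in a single architecture can overcome the factual inconsistency, shallow multi-hop reasoning, and weak domain alignment that limit large language models in structured knowledge domains.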

March 13, 2026 | Open Access

Hybrid retrieval generation for structured reasoning with large language models

Rathinasamy Muthusami (Pennsylvania College of Health Sciences), Kandhasamy Saritha

Key Points

The research aims to enhance the performance of large language models in structured knowledge domains by addressing limitations in reasoning and accuracy.
Developed EduRAG-Compose, a hybrid retrieval–generation architecture.
Implemented hierarchical coarse-to-fine retrieval and dynamic knowledge graph traversal.
Utilized cache-augmented inference and compositional multi-step generation.
Evaluated performance on structured-domain benchmarks including ScienceQA, TQA, and interaction-derived query sets from EdNet.
Achieved 91.2% factual accuracy, surpassing vanilla RAG (78.4%), ColBERT-RAG (82.9%), and GPT + RAG (88.5%).
Improved domain alignment from 0.68 to 0.89.
Increased structural reasoning consistency from 0.61 to 0.87.
Maintained an average response latency of 4.8 seconds, comparable to baseline systems.
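
The coarse-to-fine retrieval and caching ideas listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy passages, the 2-D embeddings, the cluster assignments, and the function names (`cosine`, `retrieve`) are all hypothetical, and Python's `functools.lru_cache` stands in for whatever caching layer EduRAG-Compose actually uses.

```python
import math
from functools import lru_cache

# Toy corpus of (passage, embedding) pairs. The 2-D vectors are invented
# for illustration; a real system would use a dense encoder.
PASSAGES = [
    ("Photosynthesis converts light energy into chemical energy.", (0.9, 0.1)),
    ("Chlorophyll absorbs mostly red and blue light.", (0.8, 0.3)),
    ("Newton's second law relates force, mass, and acceleration.", (0.1, 0.9)),
    ("Friction opposes relative motion between surfaces.", (0.2, 0.8)),
]

# Hypothetical precomputed clusters: name -> (centroid, passage indices).
CLUSTERS = {
    "biology": ((0.85, 0.2), [0, 1]),
    "physics": ((0.15, 0.85), [2, 3]),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


@lru_cache(maxsize=256)  # repeated queries skip both retrieval stages
def retrieve(query_vec, top_clusters=1, top_k=2):
    """Coarse stage picks clusters by centroid; fine stage ranks their passages."""
    # Coarse: rank clusters by similarity between the query and each centroid.
    ranked = sorted(
        CLUSTERS,
        key=lambda name: cosine(query_vec, CLUSTERS[name][0]),
        reverse=True,
    )
    candidates = [i for name in ranked[:top_clusters] for i in CLUSTERS[name][1]]
    # Fine: rank only the passages inside the selected clusters.
    candidates.sort(key=lambda i: cosine(query_vec, PASSAGES[i][1]), reverse=True)
    return tuple(PASSAGES[i][0] for i in candidates[:top_k])
```

The query vector is passed as a tuple so that it is hashable and can serve as a cache key. Scoring centroids first means the fine stage compares the query against a few passages rather than the whole corpus, which is the efficiency argument behind clustered dense retrieval.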

Abstract

Large Language Models (LLMs) exhibit strong generative capabilities but remain limited in structured knowledge domains due to factual inconsistency, shallow multi-hop reasoning, and weak alignment with domain constraints. To address these limitations, this paper presents EduRAG-Compose, a unified hybrid retrieval–generation architecture that integrates hierarchical coarse-to-fine retrieval, dynamic knowledge graph traversal, cache-augmented inference, and compositional multi-step generation into a single framework for structured domain reasoning. The architecture combines clustered dense retrieval for efficient evidence selection, graph-based dependency traversal for multi-hop inference, recursive retrieval–generation cycles for compositional reasoning, and a caching layer to reduce end-to-end latency. A domain-alignment classifier and a constrained decoding strategy further support structured and interpretable output generation consistent with underlying semantic dependencies. The framework is evaluated on structured-domain benchmarks including Science Question Answering (ScienceQA), Textbook Question Answering (TQA), and interaction-derived query sets from EdNet, focusing on retrieval quality, reasoning coherence, and generative consistency. Across these datasets, EduRAG-Compose achieves 91.2 ± 1.3% factual accuracy, outperforming vanilla Retrieval-Augmented Generation (RAG) at 78.4 ± 2.1%, ColBERT-based Retrieval-Augmented Generation (ColBERT-RAG) at 82.9 ± 1.9%, and Generative Pre-trained Transformer with retrieval (GPT + RAG) at 88.5 ± 1.5%. Domain-alignment performance improves from 0.68 ± 0.05 to 0.89 ± 0.03, while structural reasoning consistency increases from 0.61 ± 0.06 to 0.87 ± 0.04, with an average response latency of 4.8 ± 0.7 s, comparable to baseline retrieval systems and substantially lower than Chain-of-Thought (CoT) prompting. All reported improvements are statistically significant based on paired Wilcoxon signed-rank tests (p < 0.01). 
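
To make the significance claim concrete, the following sketch computes a paired Wilcoxon signed-rank statistic with an exact two-sided p-value. The scores are toy numbers invented for illustration, not data from the paper, and the abstract does not state what the paired units were (for example, per question or per fold). The function name `wilcoxon_signed_rank` is likewise an assumption.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Return (W, exact two-sided p) for paired samples x and y.

    Zero differences are dropped and tied absolute differences receive
    average ranks. The exact p-value enumerates all sign assignments,
    so this is only practical for small samples.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w = min(w_plus, total - w_plus)

    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= w:
            extreme += 1
    return w, extreme / 2 ** n


# Toy paired scores for two hypothetical systems on eight evaluation units.
system_a = [14, 11, 17, 15, 12, 16, 13, 18]
system_b = [9, 8, 10, 11, 7, 12, 6, 10]
w_stat, p_value = wilcoxon_signed_rank(system_a, system_b)
print(w_stat, p_value)
```

With eight pairs that all favor the same system, W is 0 and the exact two-sided p-value is 2/256, which is about 0.0078. Eight is therefore the smallest number of non-zero paired differences that can reach p < 0.01 under this test.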
Qualitative reasoning traces further demonstrate that explicit retrieval paths and graph traversal sequences enhance interpretability and support human-in-the-loop verification. Rather than introducing new retrieval or generation algorithms, the primary contribution lies in the systematic integration of complementary retrieval and reasoning mechanisms into a single transparent architecture for structured domains. Limitations related to dataset scope, computational overhead, and text-only processing are acknowledged, and future work will explore multimodal extensions, computational optimization for real-time deployment, and adaptive reasoning mechanisms capable of incorporating continuous feedback or evolving knowledge sources.

Unified Hybrid RAG Framework: Proposes EduRAG-Compose integrating hierarchical retrieval, graph reasoning, caching, and compositional generation.
Statistically Proven Gains: Achieves significant improvements in factual accuracy and domain alignment over strong baselines (p < 0.01).
Explainable Multi-Hop Reasoning: Enables transparent inference via explicit retrieval paths and graph traversal traces.
Scalable Structured-Domain Design: Supports efficient and portable deployment across diverse structured knowledge domains.
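
The idea of exposing a graph traversal sequence as an inspectable reasoning trace can be sketched as follows. This is an illustrative assumption, not the paper's algorithm: the concept graph, the `traverse` function, and the `max_hops` parameter are hypothetical, and a plain breadth-first search stands in for the dynamic traversal the abstract describes.

```python
from collections import deque

# Hypothetical concept graph: each concept maps to the concepts it depends on.
GRAPH = {
    "photosynthesis": ["chlorophyll", "light energy"],
    "chlorophyll": ["chloroplast"],
    "light energy": [],
    "chloroplast": ["plant cell"],
    "plant cell": [],
}


def traverse(start, target, max_hops=3):
    """Breadth-first search that returns the hop-by-hop path to target.

    Returns None if target is not reachable within max_hops edges.
    The returned list is the trace a human reviewer could inspect.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up on this branch
        for neighbor in GRAPH.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


trace = traverse("photosynthesis", "plant cell")
print(" -> ".join(trace))
```

Because the function returns the full path rather than only the final node, each hop can be checked against the source material, which is the kind of human-in-the-loop verification the abstract refers to.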

Cite This Study

Muthusami et al. (2026) studied this question.

synapsesocial.com/papers/69b3abd602a1e69014ccd0f7
https://doi.org/10.1007/s44163-026-01059-9