What question did this study set out to answer?

This research aims to address the structural conflict between neural generation and deterministic execution in AI systems.

April 25, 2026 | Open Access

Mitigating Execution Hallucinations and Computational Inflation in Agentic RAG via Strict Protocol Boundaries

Haitao Zhang, Dan Li, Xiaoyi Nie (Hunan Agricultural University)

Key Points

Addresses the structural conflict between probabilistic neural generation and deterministic tool execution in agentic retrieval systems.
Proposes RAG-CoT-MCP, a neuro-symbolic architecture that separates cognitive planning from execution.
Evaluates the approach across four different datasets using a multi-dimensional LLM-as-a-Judge framework and rigorous ablation studies.
Tracks token consumption, inference latency, and execution error rates to quantify cost and reliability.
Reduces network-level execution error rates from 45.2% in unconstrained baselines to 6.0% with the proposed architecture.
Achieves significant enhancements in semantic comprehensiveness and logical coherence compared to existing methods.
Achieves the lowest overall inference latency in the evaluation while cutting redundant token consumption.
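
To give a feel for why a lower error rate can reduce cost, here is a back-of-the-envelope sketch. It is not a result from the paper. It assumes each tool call fails independently with a fixed probability and is retried until it succeeds, so the expected number of attempts is 1 / (1 - p). The function name `expected_attempts` and the retry model are illustrative assumptions; only the two error rates come from the study.

```python
def expected_attempts(error_rate: float) -> float:
    """Expected attempts per successful call under an assumed
    independent-retry (geometric) model. Illustrative only."""
    if not 0.0 <= error_rate < 1.0:
        raise ValueError("error_rate must be in [0, 1)")
    return 1.0 / (1.0 - error_rate)


# Error rates reported in the study; the retry model itself is an assumption.
for label, p in [("unconstrained baseline", 0.452), ("RAG-CoT-MCP", 0.060)]:
    print(f"{label}: {expected_attempts(p):.2f} expected attempts per successful call")
```

Under this simplified model the unconstrained baseline needs about 1.82 attempts per successful call, against about 1.06 for the constrained system. Real recovery loops also re-send context and reasoning tokens, so the true cost gap depends on factors this sketch ignores.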

Abstract

The deployment of large language models as autonomous retrieval agents over unstructured knowledge bases gives rise to a persistent structural conflict between probabilistic neural generation and deterministic physical execution. While agentic paradigms facilitate complex multi-hop retrieval, their unconstrained generative nature frequently violates strict syntactic requirements. This systemic vulnerability directly triggers execution hallucinations, such as fabricated API parameters or malformed schemas. Consequently, these syntax-driven failures force systems into redundant trial-and-error recovery loops, resulting in severe computational inflation that degrades both token efficiency and inference latency. To resolve this reliability–efficiency dilemma, this paper proposes RAG-CoT-MCP, a neuro-symbolic architecture that orthogonally decouples probabilistic cognitive planning from deterministic tool execution. By integrating the Model Context Protocol (MCP) as a strict system-level validation boundary, the framework ensures that latent reasoning trajectories manifest exclusively as syntactically valid operations. Exhaustive empirical evaluations across four disparate datasets—incorporating a multi-dimensional LLM-as-a-Judge framework, rigorous ablation studies, and granular cost tracking—validate the proposed approach. The findings demonstrate that RAG-CoT-MCP compresses network-level execution error rates from 45.2% (in unconstrained baselines) to a mere 6.0%, yielding substantial enhancements in semantic comprehensiveness and logical coherence compared to existing baselines. Counterintuitively, by proactively intercepting malformed actions and redirecting computational resources from reactive error handling to valid causal deduction, the framework drastically reduces redundant token consumption and achieves the lowest overall inference latency. 
Ultimately, this study establishes that deterministic execution constraints do not hinder agentic flexibility; rather, they serve as a fundamental prerequisite for deploying robust, high-speed, and cost-effective knowledge retrieval systems.
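
The idea of a strict validation boundary can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the MCP SDK. MCP tools declare their inputs with a JSON Schema; the sketch replaces that with a simple name-to-type mapping. All names here (`ToolSpec`, `ProtocolViolation`, `validate_call`, `execute`, `search_docs`) are hypothetical. The point is only that a model-proposed call is checked against a declared schema before any handler runs, so fabricated parameters are rejected up front rather than failing during execution.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool declaration: required parameter names and types."""
    name: str
    required: dict[str, type]
    handler: Callable[..., Any]


class ProtocolViolation(Exception):
    """Raised when a proposed call does not match the declared schema."""


def validate_call(registry: dict[str, ToolSpec], call: dict[str, Any]):
    """Check a model-proposed call against the registry; return (tool, args)."""
    name = call.get("tool")
    if not isinstance(name, str) or name not in registry:
        raise ProtocolViolation(f"unknown tool: {name!r}")
    tool = registry[name]
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        raise ProtocolViolation("arguments must be an object")
    unknown = set(args) - set(tool.required)
    if unknown:
        raise ProtocolViolation(f"undeclared parameters: {sorted(unknown)}")
    missing = set(tool.required) - set(args)
    if missing:
        raise ProtocolViolation(f"missing parameters: {sorted(missing)}")
    for key, expected in tool.required.items():
        if not isinstance(args[key], expected):
            raise ProtocolViolation(f"{key!r} must be {expected.__name__}")
    return tool, args


def execute(registry: dict[str, ToolSpec], call: dict[str, Any]) -> Any:
    """Run the handler only after the call has passed validation."""
    tool, args = validate_call(registry, call)
    return tool.handler(**args)


# Hypothetical retrieval tool used only for this illustration.
def search_docs(query: str, top_k: int) -> list[str]:
    return [f"doc-{i} for {query}" for i in range(top_k)]


REGISTRY = {
    "search_docs": ToolSpec("search_docs", {"query": str, "top_k": int}, search_docs),
}

# A well-formed call passes the boundary and runs.
print(execute(REGISTRY, {"tool": "search_docs",
                         "arguments": {"query": "soil pH", "top_k": 2}}))

# A call with a fabricated parameter ("limit") is intercepted before execution.
try:
    execute(REGISTRY, {"tool": "search_docs",
                       "arguments": {"query": "soil pH", "limit": 2}})
except ProtocolViolation as err:
    print("rejected:", err)
```

In this sketch the planner (the language model) may propose anything, but only calls that pass `validate_call` ever reach a handler. That mirrors, in a much simplified form, the separation of planning from execution that the abstract describes.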
