What question did this study set out to answer?

The aim is to improve the alignment between input queries and retrieval processes in large language models.

March 5, 2026 · Open Access

RewriteGen: Autonomous Query Optimization for Retrieval-Augmented Large Language Models via Reinforcement Learning

Key Points

The aim is to improve the alignment between input queries and retrieval processes in large language models.
The study proposes ReWriteGen, a framework that integrates query rewriting, retrieval augmentation, and complementary generation.
Training combines reinforcement learning via Group Relative Policy Optimization (GRPO) with Direct Preference Optimization (DPO).
Experiments cover three multi-hop QA benchmarks: HotpotQA, MuSiQue, and 2Wiki.
ReWriteGen consistently outperforms traditional retrieval-augmented generation baselines on all three benchmarks.
On HotpotQA, it improves EM and LLM-based evaluation by 5.32 and 5.10 percentage points, respectively, over the strongest baseline.
On MuSiQue, the corresponding gains are 11.90 and 7.18.
On 2Wiki, the corresponding gains are 15.45 and 18.60.

Abstract

Large Language Models (LLMs) have achieved substantial progress in knowledge-intensive tasks, particularly through Retrieval-Augmented Generation (RAG) frameworks. However, existing RAG systems often suffer from performance degradation when input queries are misaligned with retrieval requirements, and effectively coordinating retrieval with reasoning remains challenging, especially for multi-hop questions requiring iterative retrieval steps. To address these challenges, we propose ReWriteGen, a unified framework that integrates query rewriting, retrieval augmentation, and complementary generation within a coordinated architecture, optimized using reinforcement learning (Group Relative Policy Optimization, GRPO) and Direct Preference Optimization (DPO). ReWriteGen introduces a retrieval-aware query rewriting mechanism to better align input queries with external knowledge. The framework optimizes retrieval-augmented answers without requiring supervised reasoning annotations. Our experiments show that ReWriteGen consistently outperforms traditional RAG baselines across three multi-hop QA benchmarks: HotpotQA, MuSiQue, and 2Wiki. On HotpotQA, ReWriteGen achieves improvements of 5.32 and 5.10 percentage points in EM and LLM-based evaluation, respectively, compared to the strongest baseline. Corresponding gains of 11.90 and 7.18 are observed on MuSiQue, and 15.45 and 18.60 on 2Wiki. ReWriteGen enhances the coordination between retrieval and reasoning in LLMs, delivering consistent performance gains while reducing reliance on supervised reasoning annotations and extensive task-specific engineering.
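The abstract names two ingredients that are easy to illustrate in isolation: the exact-match (EM) metric and the group-relative scoring that gives GRPO its name. The sketch below is not the authors' implementation. It assumes, purely for illustration, that each sampled answer for a question receives an EM reward and that rewards are standardized within the group of samples. The function names (`normalize`, `exact_match`, `group_relative_advantages`) and the normalization rules are hypothetical choices, not details taken from the paper.

```python
import re
import statistics
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return 1.0 if normalize(prediction) == normalize(gold) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of sampled answers.

    Each answer is scored relative to the group mean, so no separate
    value model is needed. `eps` (an assumed constant) avoids division
    by zero when every sample gets the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical group of four sampled answers to one question.
gold_answer = "Eiffel Tower"
samples = ["The Eiffel Tower.", "Louvre", "Notre-Dame", "eiffel tower"]
rewards = [exact_match(s, gold_answer) for s in samples]
advantages = group_relative_advantages(rewards)
```

In this toy group, the two answers that match the gold string receive positive advantages and the two that do not receive negative ones. When every sample earns the same reward, all advantages are zero and the group contributes no learning signal.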
