What question did this study set out to answer?

This research aims to enhance conversational search by generating clarifying questions for ambiguous user queries.

May 16, 2026

Towards Exploring Mixed-Initiative Conversation Generation Based on Community Question Answering

Key Points

This research aims to enhance conversational search by generating clarifying questions for ambiguous user queries.
The authors developed a three-stage framework, based on large language models (LLMs), that builds on an existing community question answering (CQA) dataset.
Stage 1 extracts essential information from the initial user query together with the relevant contextual information.
Stages 2 and 3 generate clarifying questions paired with answers, then refine the conversations for coherence and natural flow.
The three-stage approach consistently outperformed a single-step prompting baseline, particularly in recall.
It also achieved competitive results on precision, nDCG, and MAP.
Human and automatic evaluations indicated that the generated conversations are of high quality, and fine-tuning on them improved retrieval performance.
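
The three stages listed above can be sketched as a chain of LLM calls in which each stage consumes the previous stage's output. This is only an illustrative outline under stated assumptions, not the authors' implementation: the `llm` callable, the function name `generate_conversation`, the returned keys, and all prompt wording are hypothetical and do not come from the study.

```python
from typing import Callable, Dict

# Hypothetical interface: any function that maps a prompt string to generated text.
LLM = Callable[[str], str]


def generate_conversation(query: str, context: str, llm: LLM) -> Dict[str, str]:
    """Chain three LLM calls; prompts are placeholders, not the study's prompts."""
    # Stage 1: extract essential information from the query and its context.
    extracted = llm(
        "Extract the essential information from this user query and context.\n"
        f"Query: {query}\nContext: {context}"
    )

    # Stage 2: generate clarifying questions paired with answers.
    draft = llm(
        "Using the extracted information, write clarifying questions, "
        "each paired with an answer.\n"
        f"Query: {query}\nExtracted information: {extracted}"
    )

    # Stage 3: refine the draft for coherence and natural conversational flow.
    refined = llm(
        "Rewrite this draft as a coherent, natural multi-turn conversation.\n"
        f"Draft: {draft}"
    )

    return {"extracted": extracted, "draft": draft, "conversation": refined}
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, and returning the intermediate outputs makes each stage easy to inspect separately.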

Abstract

Conversational search (CS) addresses users’ information needs through multi-turn and context-aware interactions. Given that user queries are often ambiguous, the use of clarifying questions can effectively reduce uncertainty and enable a mixed-initiative conversational system. However, current datasets for clarifying questions remain limited in the following three aspects: (1) underrepresented multi-turn conversational data, (2) limited diversity, and (3) heavy reliance on crowdsourcing, which brings limitations such as high annotation cost. To address these issues, we propose an LLM-based three-stage framework that relies on an existing community question answering (CQA) dataset. It encompasses: (1) extracting essential information from the initial user query with the relevant contextual information, (2) generating clarifying questions paired with corresponding answers, and (3) refining conversations to ensure coherence and a natural conversational flow. We assess our multi-stage method against a baseline that directly prompts LLMs to generate conversations in a single-step process, evaluating on an answer retrieval task using recall, precision, nDCG, and MAP. Results show that our three-stage generation approach consistently outperforms the baseline, particularly in recall, while also achieving competitive results across other metrics. Human and automatic evaluations further indicate that the generated conversations are of high quality and that fine-tuning on them improves retrieval performance, highlighting the pipeline's potential.
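
For readers unfamiliar with the four retrieval metrics named in the abstract, the following is a minimal sketch of their standard binary-relevance definitions. It is not the study's evaluation code: the function names, the cutoff parameter `k`, and the toy document IDs are assumptions made for illustration, and the study may use different cutoffs or graded relevance.

```python
import math
from typing import List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, doc in enumerate(ranked[:k])
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP: average precision averaged over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy example: one query, four retrieved answers, two of them relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(precision_at_k(ranked, relevant, 2))  # 0.5
print(recall_at_k(ranked, relevant, 2))     # 0.5
```

Recall rewards retrieving every relevant answer regardless of position, whereas nDCG and MAP also reward placing relevant answers near the top, which is why a method can lead on recall while being merely competitive on the others.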
