Abstract

Modern conversational AI struggles to sustain empathy over long-form interactions. We propose SCIRAG (Semantic Context Improvisational Retrieval-Augmented Generation), a feedback-driven retrieval framework for adaptive empathic dialogue. SCIRAG employs a dual-loop retrieval design that iteratively optimizes a static counseling dataset using user metadata and feedback memory refinement. To improve contextual alignment, we apply retrieval adaptation, which lets the model retain and reuse past conversational cues according to user preferences. When integrated with Mixtral-8x7B, SCIRAG improves human-rated empathic understanding by 1.26 points and empathic response by 1.00 points on the RoPE scale, while increasing acceptability by 7.66 points compared to a fine-tuned non-RAG baseline. Automatic evaluation further shows gains in semantic alignment (BERTScore-F1 +0.11) and fluency (perplexity reduced from 19.1 to 12.3). We also present EMPATHIC, a dataset of unscripted therapeutic conversations. Unlike conventional datasets that contain only 2-4 dialogue turns per conversation, EMPATHIC provides extended conversational trajectories (50+ turns per session), allowing models to learn long-range coherence and empathic listening. The dataset will be publicly released.
Tahir et al. (Fri,) studied this question.