Abstract

Interactive recommendation systems (IRS) have become a prominent research topic because they dynamically optimize user experience through real-time feedback loops. To model the evolving dynamics of user preferences and maximize long-term rewards, reinforcement learning (RL) has been incorporated into IRS by formulating the recommendation process as a Markov decision process (MDP). However, RL policies trained on static offline data still face two major challenges: (1) distribution shift, where the mismatch between offline logs and dynamic online environments often leads to suboptimal long-term decision-making; and (2) sample efficiency, as the large action space in recommendation tasks requires substantial interaction before a policy reaches optimal performance. To address these issues, we propose ARLK (Adaptive Reinforcement Learning with Large Language Models and Knowledge Graphs), a novel adaptive framework that combines large language model (LLM)-guided offline pretraining with knowledge graph (KG)-enhanced online learning through an adaptive policy fusion mechanism that smoothly transitions from offline initialization to online adaptation. LLMs provide strong semantic understanding that can capture user preferences and simulate interaction feedback, thereby improving policy pretraining and ensuring high-quality initial recommendations in simulation-based online evaluation. Meanwhile, the structured information in KGs is used during policy learning to guide candidate generation and significantly reduce exploration cost. Experiments on three benchmark datasets demonstrate that ARLK achieves substantial improvements in both initial recommendation quality and long-term performance compared with state-of-the-art baselines, with average reward improvements of 5.15%, 3.40%, and 1.80% on the LFM, Industry, and Coat datasets, respectively, and a gain of up to 12.73% in Recall@10 on the Coat dataset.
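The abstract does not specify how the adaptive policy fusion is computed. As a rough illustration only, one way to picture a smooth transition from an offline-pretrained policy to an online-adapted one is a convex mixture of their action distributions whose weight shifts over time. Everything in the sketch below is an assumption made for illustration and is not taken from the paper: the function names `fusion_weight` and `fuse_policies`, the sigmoid schedule, and the `midpoint` and `sharpness` parameters are all hypothetical.

```python
import math


def fusion_weight(step, midpoint=500.0, sharpness=0.01):
    """Hypothetical schedule: weight placed on the online policy.

    Rises smoothly from near 0 (trust the offline policy) to near 1
    (trust the online policy) as the interaction step grows.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (step - midpoint)))


def fuse_policies(offline_probs, online_probs, step):
    """Mix two action distributions over the same candidate items.

    Both inputs are assumed to be probability lists of equal length.
    The result is renormalized to guard against rounding drift.
    """
    w = fusion_weight(step)
    mixed = [
        (1.0 - w) * p_off + w * p_on
        for p_off, p_on in zip(offline_probs, online_probs)
    ]
    total = sum(mixed)
    return [p / total for p in mixed]


if __name__ == "__main__":
    offline = [0.7, 0.2, 0.1]  # illustrative offline-pretrained policy
    online = [0.1, 0.3, 0.6]   # illustrative online-adapted policy
    for step in (0, 500, 1000):
        print(step, fuse_policies(offline, online, step))
```

Under these assumptions, early recommendations follow the offline policy almost entirely, and the online policy takes over gradually as interaction accumulates. A fixed step-based schedule is only the simplest choice; the weight could equally depend on observed reward or policy uncertainty.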
Fan et al. (Tue,) studied this question.