SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets

July 28, 2019 (Open Access)

Abstract

Reinforcement learning methods for recommender systems optimize recommendations for long-term user engagement. However, since users are often presented with slates of multiple items, which may have interacting effects on user choice, methods are required to deal with the combinatorics of the RL action space. We develop SlateQ, a decomposition of value-based temporal-difference and Q-learning that renders RL tractable with slates. Under mild assumptions on user choice behavior, we show that the long-term value (LTV) of a slate can be decomposed into a tractable function of its component item-wise LTVs. We demonstrate our methods in simulation, and validate the scalability and effectiveness of decomposed TD-learning on YouTube.
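
The abstract does not spell out the decomposition, so the following is only a minimal sketch of one way to read it: the value of a slate is taken to be the choice-probability-weighted sum of item-wise LTVs, with choice probabilities from a conditional-logit-style model that includes a "no selection" option. The function names, the `null_score` parameter, and the toy numbers are all assumptions made for illustration, and the brute-force search stands in for whatever slate optimization the paper actually uses.

```python
from itertools import combinations

def choice_probs(scores, null_score=1.0):
    """Assumed choice model: P(i | slate) = v_i / (v_null + sum_j v_j)."""
    total = null_score + sum(scores)
    return [v / total for v in scores]

def slate_value(slate, score, item_ltv, null_score=1.0):
    """Slate LTV as the choice-weighted sum of item-wise LTVs.

    The 'no selection' outcome is assumed to contribute zero value here.
    """
    probs = choice_probs([score[i] for i in slate], null_score)
    return sum(p * item_ltv[i] for p, i in zip(probs, slate))

def best_slate(items, k, score, item_ltv, null_score=1.0):
    """Brute-force search over all size-k slates (toy scale only)."""
    return max(
        combinations(items, k),
        key=lambda s: slate_value(s, score, item_ltv, null_score),
    )

# Hypothetical data: item "a" is the most attractive but has the lowest LTV.
score = {"a": 2.0, "b": 1.0, "c": 1.0}
item_ltv = {"a": 1.0, "b": 4.0, "c": 3.0}

print(slate_value(("a", "b"), score, item_ltv))    # 1.5
print(best_slate(["a", "b", "c"], 2, score, item_ltv))  # ('b', 'c')
```

In this toy case the highly attractive item "a" is left out of the best slate, because it would draw choices away from items with higher long-term value. This is the kind of interaction between slate items that the abstract refers to.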

Cite This Study

Ie et al. (2019) studied this question.

synapsesocial.com/papers/6a08a7981e0fcf4a43e8e3e0 https://doi.org/10.24963/ijcai.2019/360