What type of study is this?

This is a quantitative study.

October 20, 2025 | Open Access

Reinforcement Learning from User Feedback

Key Points

RLUF aligns large language models directly with implicit feedback signals from users in production.
In live A/B tests, policy optimization with the P[Love] reward model raised Love Reactions by 28%.
The framework addresses key challenges of user feedback, which is often binary, sparse, and occasionally adversarial.
Optimizing for positive reactions introduces reward hacking, so the objectives must be balanced carefully.

Abstract

As large language models (LLMs) are increasingly deployed in diverse user-facing applications, aligning them with real user preferences becomes essential. Existing methods like Reinforcement Learning from Human Feedback (RLHF) rely on expert annotators trained on manually defined guidelines, whose judgments may not reflect the priorities of everyday users. We introduce Reinforcement Learning from User Feedback (RLUF), a framework for aligning LLMs directly to implicit signals from users in production. RLUF addresses key challenges of user feedback: it is often binary (e.g., emoji reactions), sparse, and occasionally adversarial. We train a reward model, P[Love], to predict the likelihood that an LLM response will receive a Love Reaction, a lightweight form of positive user feedback, and integrate P[Love] into a multi-objective policy optimization framework alongside helpfulness and safety objectives. In large-scale experiments, we show that P[Love] is predictive of increased positive feedback and serves as a reliable offline evaluator of future user behavior. Policy optimization using P[Love] significantly raises observed positive-feedback rates, including a 28% increase in Love Reactions during live A/B tests. However, optimizing for positive reactions introduces reward hacking challenges, requiring careful balancing of objectives. By directly leveraging implicit signals from users, RLUF offers a path to aligning LLMs with real-world user preferences at scale.
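
The two ideas in the abstract, a reward model trained on binary reactions and a reward that mixes several objectives, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the loss, the model architecture, or how the objectives are combined. The sketch uses binary cross-entropy on 0/1 reaction labels and a plain weighted sum of three scores. The names `binary_feedback_loss` and `combined_reward` and the weights are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function mapping a logit to a probability."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_feedback_loss(logits, labels) -> float:
    """Mean binary cross-entropy between predicted reaction probabilities
    and observed 0/1 reactions (1 = the response received a Love Reaction).

    Assumption: a standard cross-entropy objective; the source does not
    state which loss its reward model uses.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(labels)


def combined_reward(p_love: float, helpfulness: float, safety: float,
                    weights=(0.3, 0.4, 0.3)) -> float:
    """Weighted sum of three reward signals.

    Assumption: a linear combination with hypothetical weights; the source
    only says the objectives are optimized jointly and must be balanced.
    """
    w_love, w_help, w_safe = weights
    return w_love * p_love + w_help * helpfulness + w_safe * safety


if __name__ == "__main__":
    # Toy logits from a hypothetical reward model and the reactions observed.
    logits = [2.0, -1.5, 0.3]
    labels = [1, 0, 0]
    print("loss:", binary_feedback_loss(logits, labels))
    print("reward:", combined_reward(sigmoid(2.0), helpfulness=0.8, safety=0.9))
```

Keeping the reaction score as only one weighted term is one simple way to limit how far a policy can raise its reward by chasing reactions alone, which is the reward hacking risk the abstract mentions.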

Authors

Eun‐Jung Han

Jun Chen

Karthik Abinav Sankararaman
