May 1, 2026 | Open Access

When the World Model Lies: Measuring and Characterising Reward Exploitation in DreamerV3 under Sparse Feedback

Key Points

This research aims to systematically characterize the exploitation gap in DreamerV3 when using sparse feedback.
Characterized reward exploitation using four new metrics in DreamerV3 on AntMaze.
Monitored the imagined-to-real reward ratio and KL divergence throughout training.
Proposed and tested KL-aware mitigation strategies to address the exploitation gap.
Imagined-to-real reward ratio reached approximately 50x at 500k environment steps, with evaluation return below 0.05.
KL divergence collapse was a leading indicator of exploitation onset, with a lag of approximately 50k steps (r = -0.91, p < 0.001).
Sparse context-kernel gating, as used in the hierarchical baseline THICK, reduced but did not eliminate the exploitation gap, while a dense-reward ablation suppressed it entirely.

Abstract

Model-based reinforcement learning agents that plan entirely in imagination can achieve high imagined returns while completely failing the actual task, a failure mode we term the exploitation gap. We provide the first systematic characterisation of this gap in DreamerV3 on AntMaze, where the world model receives near-zero reward from real experience. Instrumenting the training loop with four new metrics, we show that the imagined-to-real reward ratio reaches approximately 50x at 500k environment steps while evaluation return stays below 0.05. We establish that KL divergence collapse is a leading indicator of exploitation onset with an approximately 50k-step lag (r = -0.91, p < 0.001), providing an actionable early-warning signal. Comparing to the hierarchical baseline THICK, we show that sparse context-kernel gating reduces but does not eliminate the gap. A dense-reward ablation confirms that rich reward signal suppresses exploitation entirely. We propose three KL-aware mitigation strategies and release all experimental infrastructure for reproducibility.
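
To make the two headline quantities concrete, here is a minimal sketch of how an imagined-to-real reward ratio and a lagged KL correlation could be computed from logged training curves. The abstract does not give the exact metric definitions, so everything below is an assumption for illustration: the function names, the `eps` floor, the ratio as a quotient of mean rewards, and the use of a plain Pearson correlation between the KL curve and the ratio curve shifted by a fixed number of logging intervals.

```python
from math import sqrt
from statistics import fmean


def imagined_to_real_ratio(imagined_rewards, real_rewards, eps=1e-8):
    """Mean imagined reward divided by mean real reward.

    Hypothetical definition; `eps` floors the denominator because real
    reward is near zero under sparse feedback.
    """
    return fmean(imagined_rewards) / max(fmean(real_rewards), eps)


def pearson(x, y):
    """Plain Pearson correlation; undefined (ZeroDivisionError) for constant input."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(kl, ratio, lag):
    """Correlate kl[t] with ratio[t + lag], where lag counts logging intervals."""
    if len(kl) != len(ratio):
        raise ValueError("kl and ratio must be logged at the same steps")
    if not 0 <= lag < len(kl) - 1:
        raise ValueError("lag must leave at least two paired samples")
    return pearson(kl[: len(kl) - lag], ratio[lag:])


# Toy curves (not data from the paper): KL falls first, the ratio rises two
# logging intervals later.
kl_curve = [5.0, 4.0, 3.0, 2.0, 1.0, 1.0, 1.0]
ratio_curve = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
r_at_lag_2 = lagged_correlation(kl_curve, ratio_curve, lag=2)
```

A strongly negative correlation at a positive lag is the pattern the abstract describes: a drop in KL precedes growth in the ratio. With metrics logged every 10k environment steps, for example, a 50k-step lag would correspond to `lag=5`.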

Authors

Arkat Khassanov

Institutions

Astana Medical University
