May 30, 2024 · Open Access

Q-learning as a monotone scheme

Abstract

Stability issues with reinforcement learning methods persist. To better understand some of these stability and convergence issues involving deep reinforcement learning methods, we examine a simple linear quadratic example. We interpret the convergence criterion of exact Q-learning in the sense of a monotone scheme and discuss consequences of function approximation on monotonicity properties.
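As a rough illustration of what monotonicity can mean for exact Q-learning on a linear quadratic problem, the sketch below runs exact Q-iteration on a scalar, discounted problem. It is not taken from the paper. The dynamics x' = a*x + b*u, the stage cost q*x^2 + r*u^2, the discount gamma, and all numerical values are assumptions chosen for illustration. With a quadratic value function V(x) = p*x^2, minimising the Q-function over u reduces each iteration to a scalar update of p. That update is increasing in p, so starting from p = 0 the iterates form a nondecreasing sequence.

```python
def riccati_step(p, a=1.1, b=1.0, q=1.0, r=1.0, gamma=0.9):
    """One exact Q-iteration step for V(x) = p * x**2 (hypothetical parameters).

    Q(x, u) = q*x^2 + r*u^2 + gamma * p * (a*x + b*u)^2.
    Minimising over u in closed form gives the new coefficient below.
    """
    return q + gamma * a**2 * r * p / (r + gamma * b**2 * p)


def iterate(p0=0.0, steps=50, **params):
    """Apply riccati_step repeatedly and return every iterate, starting at p0."""
    ps = [p0]
    for _ in range(steps):
        ps.append(riccati_step(ps[-1], **params))
    return ps


if __name__ == "__main__":
    ps = iterate()
    # Print the first few iterates and the final one.
    print(ps[:5], ps[-1])
```

Because the update preserves order (a larger p maps to a larger next p), two runs started from ordered initial values stay ordered at every step. Once the quadratic form is replaced by a general function approximator, this ordering property is no longer automatic, which is the kind of question the abstract raises.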

Cite This Study

Lingyi Yang studied this question.

synapsesocial.com/papers/68e67cafb6db64358760643f https://doi.org/10.48550/arxiv.2405.20538

Also Consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

1. Toward a Unified Lyapunov-Certified ODE Convergence Analysis of Smooth Q-Learning with p-Norms · 2024 · 1 citation
2. Two-Step Q-Learning · 2024
3. When Q-Learning fails: unstable behavior for infinite state spaces · 2026
4. On the Stability of Learning in Network Games with Many Players · 2024
5. Towards Formalizing Reinforcement Learning Theory · 2025
