Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the...
Damien Ernst et al. studied this question.