Sample-efficient batch reinforcement learning for dialogue management optimization | Synapse