Optimal policy switching algorithms for reinforcement learning | Synapse