Reinforcement learning in continuous time: advantage updating | Synapse