Actor-Critic-Type Learning Algorithms for Markov Decision Processes