Multi-objective reinforcement learning using sets of pareto dominating policies | Synapse