Multi-Discounting Reinforcement Learning Based on Reward Decomposition | Synapse