Decentralized multi-agent reinforcement learning based on best-response policies | Synapse