Multi-armed bandit-based federated reinforcement learning for dynamic task offloading in a containerized MEC | Synapse