Deep reinforcement learning with smooth policy update: Application to robotic cloth manipulation