Exploiting the morphological symmetry of robotic systems, such as humanoid and quadruped robots, is a promising direction for improving robot learning. In deep reinforcement learning (DRL) for robot control, prior studies have leveraged such symmetry to improve learning efficiency through data augmentation, equivariant multilayer perceptrons (EMLPs), and multi-agent reinforcement learning (MARL) formulations. However, DRL training is inherently unstable, as the data distribution strongly depends on exploration, which is driven by stochasticity in the environment. To address this issue, we propose a symmetry-assisted, general-purpose DRL framework for morphologically symmetric robots that enables stable and robust learning. The framework models the environment as a symmetric Markov decision process (MDP) and constructs a full-body policy from a single-sided base policy using symmetry operators. We further propose a symmetric PPO objective with a coupled importance-sampling ratio. This objective aligns the policy optimization process with the imposed symmetry and serves as a principled alternative to MAPPO-style multi-agent formulations. Experimental results demonstrate that the proposed method outperforms existing approaches on most symmetric tasks, while still maintaining performance comparable to or better than standard PPO on asymmetric tasks, where symmetry is less directly exploitable.
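To make the construction concrete, the following is a minimal sketch of how a full-body policy might be assembled from a single-sided base policy, and how a coupled importance-sampling ratio could enter a clipped PPO surrogate. It is not the authors' implementation. Everything in it is an illustrative assumption: the symmetry operators are modeled as simple left/right swaps (real morphological symmetries typically also involve sign flips), the base policy is a linear Gaussian with fixed standard deviation, and "coupled" is read here as forming one joint ratio over both sides' actions and clipping it once, rather than clipping per-side ratios separately as a MAPPO-style formulation would. All function names are hypothetical.

```python
import numpy as np

def mirror_obs(obs):
    """Hypothetical symmetry operator on observations: swap left/right halves."""
    half = obs.shape[-1] // 2
    return np.concatenate([obs[..., half:], obs[..., :half]], axis=-1)

def mirror_act(act):
    """Hypothetical symmetry operator on actions: swap left/right halves."""
    half = act.shape[-1] // 2
    return np.concatenate([act[..., half:], act[..., :half]], axis=-1)

def base_mean(W, obs):
    """Single-sided base policy (assumed linear): mean action for one side."""
    return obs @ W

def full_body_mean(W, obs):
    """Full-body mean: base policy on obs (left) and on mirrored obs (right)."""
    left = base_mean(W, obs)
    right = base_mean(W, mirror_obs(obs))
    return np.concatenate([left, right], axis=-1)

def gaussian_logp(mean, act, std=0.5):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    var = std ** 2
    return -0.5 * np.sum((act - mean) ** 2 / var + np.log(2 * np.pi * var), axis=-1)

def coupled_ratio(W_new, W_old, obs, act):
    """Joint ratio over the full-body action.

    Summing both sides' log-probabilities makes this the product of the
    per-side ratios (an assumed reading of 'coupled').
    """
    logp_new = gaussian_logp(full_body_mean(W_new, obs), act)
    logp_old = gaussian_logp(full_body_mean(W_old, obs), act)
    return np.exp(logp_new - logp_old)

def symmetric_ppo_loss(W_new, W_old, obs, act, adv, clip=0.2):
    """Clipped PPO surrogate applied once to the coupled ratio."""
    r = coupled_ratio(W_new, W_old, obs, act)
    surrogate = np.minimum(r * adv, np.clip(r, 1.0 - clip, 1.0 + clip) * adv)
    return -np.mean(surrogate)

# Small usage example with random data (shapes are arbitrary choices).
rng = np.random.default_rng(0)
obs = rng.normal(size=(8, 4))      # batch of 8 observations, 2 dims per side
act = rng.normal(size=(8, 4))      # full-body actions, 2 dims per side
adv = rng.normal(size=8)           # advantage estimates
W_old = rng.normal(size=(4, 2))    # base policy maps full obs to one side's action
W_new = W_old + 0.01 * rng.normal(size=(4, 2))
loss = symmetric_ppo_loss(W_new, W_old, obs, act, adv)
```

Under these assumptions the full-body policy is equivariant by construction: mirroring the observation mirrors the mean action, so the coupled ratio takes the same value on a transition and on its mirror image. That property holds here only because the assumed operators are involutions that act consistently on observations and actions.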
Hakoda et al. studied this question.