What question did this study set out to answer?

The aim is to develop a robust approach for bipedal running using reinforcement learning that can adapt to various terrains.

January 14, 2026

Learning robust bipedal running via structured gait and trajectory guidance

Key Points

Developed a reinforcement learning framework for bipedal running.
Implemented a kinematics-based reference trajectory generator to guide policy exploration.
Utilized an asymmetric actor-critic architecture to enhance learning effectiveness.
Extracted latent variables from dual historical observations to narrow the simulation-to-reality gap.
Conducted extensive simulations and physical experiments to validate the approach.
Achieved higher agility and more accurate velocity tracking than model-based and learning-based baselines.
Demonstrated stronger disturbance rejection while maintaining gait stability.
Exhibited spring-mass running dynamics that remain robust on both flat and uneven terrains.

Abstract

Legged robots have demonstrated remarkable potential for dynamic locomotion and terrain adaptability, making them a prominent focus of research. However, achieving robust and agile bipedal running remains challenging due to the complex dynamics of legged locomotion. In this paper, we propose a reinforcement learning framework for robust bipedal running, incorporating a simple reference trajectory generator and an asymmetric actor-critic architecture. The reference generator, based on kinematics, provides diverse trajectory references while preserving key gait characteristics, facilitating efficient policy exploration. To mitigate the simulation-to-reality gap, we extract latent variables encoding environmental and motion information from dual historical observations. Our method simplifies the trajectory generation process while maintaining effective guidance for learning. Extensive simulation and physical experiments demonstrate that, compared to model-based and learning-based baselines, our approach achieves higher agility, more accurate velocity tracking, and stronger disturbance rejection while preserving gait stability. The resulting controller exhibits spring–mass running dynamics that remain robust on both flat and uneven terrains.
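As a rough illustration of the asymmetric actor-critic idea named in the abstract, the sketch below shows only the input split: the actor receives a short history of onboard observations, while the critic also receives simulator-only ("privileged") quantities during training. This is a generic sketch, not the authors' architecture. The dimensions, the choice of privileged quantities, and the random linear layers standing in for trained networks are all assumptions made for illustration.

```python
import random

def make_linear(n_in, n_out, rng):
    """Random linear layer (weights, biases); a stand-in for a trained network."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return w, b

def linear(layer, x):
    """Apply a linear layer to a flat input vector."""
    w, b = layer
    if len(x) != len(w[0]):
        raise ValueError(f"expected {len(w[0])} inputs, got {len(x)}")
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

# Hypothetical sizes, chosen only for illustration.
OBS_DIM = 6    # onboard (proprioceptive) observation per time step
HISTORY = 4    # number of past steps given to the actor
PRIV_DIM = 3   # simulator-only values, e.g. terrain height, friction, push force
ACT_DIM = 2    # joint targets

rng = random.Random(0)
actor = make_linear(OBS_DIM * HISTORY, ACT_DIM, rng)
critic = make_linear(OBS_DIM * HISTORY + PRIV_DIM, 1, rng)

def flatten(obs_history):
    """Concatenate the per-step observations into one input vector."""
    return [v for step in obs_history for v in step]

def act(obs_history):
    """Actor: uses only what the real robot can measure."""
    return linear(actor, flatten(obs_history))

def value(obs_history, privileged):
    """Critic: also sees privileged simulator state; needed only during training."""
    return linear(critic, flatten(obs_history) + list(privileged))[0]

obs_history = [[0.0] * OBS_DIM for _ in range(HISTORY)]
privileged = [0.0] * PRIV_DIM
action = act(obs_history)
v = value(obs_history, privileged)
```

Because only the critic depends on privileged state, the actor can be deployed on hardware unchanged, which is the usual motivation for this split.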

Authors

Yunpeng Liang

Zhihui Peng

Yanzheng Zhao

Journals

Robotica

Institutions

Shanghai Jiao Tong University
