Quadrupedal wheel-legged robots are highly mobile on complex terrain, but robust locomotion control is severely hindered by the difficulty of estimating state accurately without external sensors. Existing reinforcement learning methods that rely on two-stage imitation often suffer from representation collapse and information loss during sim-to-real transfer. To address these challenges, this paper proposes an end-to-end reinforcement learning framework for implicit state estimation that incorporates terrain and external force features. Inspired by internal model control, the method uses a history of purely proprioceptive observations to extract explicit kinematic responses, and learns implicit representations of the environment and of external forces via prototypical contrastive learning. This avoids both explicit terrain regression and physical force sensors. A tailored composite reward function and a progressive curriculum training strategy with large-scale domain randomization are integrated to ensure dynamic stability and hardware safety. Cross-simulator validation and real-world deployment demonstrate agile and robust locomotion, including adaptive traversal over diverse terrains. Under external disturbances, the method reduces the lateral linear velocity tracking error from 0.2421 m/s to 0.1319 m/s. The method achieves zero-shot sim-to-real transfer with superior sample efficiency, providing a reliable and general control paradigm for wheel-legged robots in unstructured environments.
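To put the reported tracking errors in relative terms, the short sketch below computes the fractional reduction implied by the two figures quoted above. The function name `relative_reduction` and the variable names are illustrative assumptions, and the first value is treated as the reference error only because the abstract lists it first; the abstract does not say which configuration produced it.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction of an error metric (0.25 means 25% lower)."""
    if before <= 0:
        raise ValueError("reference error must be positive")
    return (before - after) / before


# Lateral linear velocity tracking errors quoted in the abstract, in m/s.
# Treating the first value as the reference is an assumption of this sketch.
reference_error = 0.2421
reported_error = 0.1319

reduction = relative_reduction(reference_error, reported_error)
print(f"Relative reduction in lateral tracking error: {reduction:.1%}")
```

Under these assumptions the two figures correspond to a reduction of roughly 45% in lateral velocity tracking error.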
Dai et al. studied this question.