What question did this study set out to answer?

The aim is to address semantic alignment in robotic tasks using a neuro-symbolic framework that integrates planning and verification.

June 10, 2026Open Access

Neuro-symbolic Hierarchical Learning for Long-Horizon Robotic Tasks

Key Points

The aim is to address semantic alignment in robotic tasks using a neuro-symbolic framework that integrates planning and verification.
Developed a closed-loop, counterexample-guided synthesis framework unifying LLM-based planning and formal verification.
Converted natural language task descriptions into PDDL and generated high-level plans through a Guess-Check-Critique loop.
Co-optimized symbolic abstractions with low-level policies in behavior trees for robustness and executability.
Achieved superior performance on long-horizon tasks compared to state-of-the-art methods.
Provided improved interpretability of robotic executions and enhanced generalization capabilities.

Abstract

Recent advances in foundation models have motivated hybrid programming that integrates natural language descriptions, formal specifications, and executable code. A critical challenge in such systems lies in achieving semantic alignment across heterogeneous representations at different abstraction levels. This challenge is particularly pressing in programmatic reinforcement learning (PRL) for robotics, where long-horizon, sparse-reward tasks demand tight coordination between symbolic reasoning and continuous control. Existing approaches either rely on manually engineered symbolic representations or on LLM-generated plans that might be untrustworthy or infeasible to execute, leaving such fundamental gaps unaddressed. We present a closed-loop, counterexample-guided synthesis framework that unifies LLM-based planning, formal verification, and differentiable behavior tree (BT) synthesis for neuro-symbolic policy learning. The framework first converts natural language task descriptions into PDDL, then generates valid high-level plans via a Guess-Check-Critique loop that interleaves LLM generation with SMT-based verification. We automatically compile symbolic abstractions in verified plans into parameterized termination conditions, and co-optimize them with low-level policies to enforce semantic consistency. The learned sub-policies are composed into an integrated BT and fine‑tuned to ensure task-level executability. Furthermore, we perform closed-loop iterations of abstractions and compilations using feedback from verification and learning, while incrementally building a reusable skill library for efficient knowledge transfer. Experiments on challenging long-horizon robotic tasks show that our method exceeds state-of-the-art methods while providing interpretability and generalization.

Read Full Paperexternally

Mark Helpful

Bookmark

Relay

View Full Paper