What question did this study set out to answer?

This work aims to explore how prompt framing influences the performance of hybrid models combining reinforcement learning and large language models in robotics.

April 26, 2026

Open Access

A platform for investigating prompt framing as interface parameters in foundation models for robotics

Key Points

The study developed a controlled experimental platform for evaluating prompt framing at the interface between RL agents and an LLM.
Three agents were compared: RL-only tabular Q-learning, LLM-only (stateless) action selection, and a hybrid LLM + RL agent.
Evaluations ran in a simulated navigation environment under a constrained interaction budget of 10 episodes per world.
The hybrid LLM + RL agent achieved higher mean success and higher mean cumulative reward than both the RL-only and LLM-only baselines.
Advisor-channel ablations with random or null recommendations reduced performance, indicating that structured advice contributes beyond adding extra text.
Navigation-role personas, narrative personas, and relational variants of a caregiver prompt produced heterogeneous effects under matched conditions.
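
The two metrics named in the key points, mean success and mean cumulative reward, can be illustrated with a small aggregation sketch. This is not the authors' evaluation code: the `EpisodeResult` record, the `summarize` helper, and the demo numbers are all hypothetical, chosen only to show how per-agent means could be computed from episode logs.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class EpisodeResult:
    """One logged episode (hypothetical record layout, not from the paper)."""
    agent: str                # e.g. "rl_only", "llm_only", "hybrid"
    world: int                # index of the simulated world
    success: bool             # whether the goal was reached
    cumulative_reward: float  # sum of rewards over the episode


def summarize(results):
    """Group episodes by agent and compute mean success and mean cumulative reward."""
    by_agent = {}
    for r in results:
        by_agent.setdefault(r.agent, []).append(r)
    return {
        agent: {
            "mean_success": mean(1.0 if r.success else 0.0 for r in rs),
            "mean_cumulative_reward": mean(r.cumulative_reward for r in rs),
            "episodes": len(rs),
        }
        for agent, rs in by_agent.items()
    }


# Made-up numbers for illustration only; they are not results from the study.
demo = [
    EpisodeResult("rl_only", 0, False, -10.0),
    EpisodeResult("rl_only", 0, True, 5.0),
    EpisodeResult("hybrid", 0, True, 8.0),
    EpisodeResult("hybrid", 0, True, 6.0),
]
summary = summarize(demo)
```

In a setup like the one described, each agent would contribute the same number of episodes per world (10 in the study), so the means are taken over matched sample sizes.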

Abstract

Foundation models, in particular large language models (LLMs), are increasingly used to describe goals for robotic control, decision making, and execution. Recent work has demonstrated hybrid paradigms that pair the strengths of reinforcement learning (RL) agents with LLMs for robotic control. The interface between the RL agent and the language model, however, offers a unique opportunity to explore how prompt framing may affect such hybrid systems. This work presents a controlled experimental platform to measure and better understand how manipulating the interface between an RL agent and an LLM affects the behaviour of a hybrid advisor-arbiter architecture. We compare three agents under matched evaluation protocols and initializations in a simulated navigation environment: (i) RL-only tabular Q-learning; (ii) LLM-only (stateless) action selection; and (iii) a hybrid LLM + RL agent. Under a constrained interaction budget (10 episodes per world), the hybrid LLM + RL agent achieves higher mean success and higher mean cumulative reward than both the RL-only and LLM-only baselines. Advisor-channel ablations (random recommendations and null recommendations) reduce performance, indicating that structured advice contributes beyond adding extra text. We further demonstrate prompt framing as a controlled factor by evaluating navigation-role personas, narrative personas, and relational variants of a caregiver prompt under matched conditions, which yields heterogeneous effects across framings. The contribution of this work is a structured testbed and evaluation approach for investigating the impact of prompt framing on multi-step decision making and control tasks.
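
The abstract does not spell out how the advisor and the arbiter interact, so the following is a minimal sketch of one possible reading: a tabular Q-learner acts as the arbiter and adds a fixed bonus to whichever action the advisor recommends. Everything here is an assumption made for illustration: the 4x4 grid, the reward values, the `bonus` parameter, and the function names. No LLM is called; `structured_advisor` is a hand-written heuristic standing in for one, and the random and null advisors mirror the two ablations.

```python
import random

# Actions as (d_row, d_col): 0 = right, 1 = left, 2 = down, 3 = up.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
SIZE, GOAL = 4, (3, 3)  # toy grid, assumed for illustration


def step(state, a):
    """Move within the grid; -1 per step, +10 on reaching the goal."""
    r = min(max(state[0] + ACTIONS[a][0], 0), SIZE - 1)
    c = min(max(state[1] + ACTIONS[a][1], 0), SIZE - 1)
    nxt = (r, c)
    done = nxt == GOAL
    return nxt, (10.0 if done else -1.0), done


def structured_advisor(state):
    """Stand-in for an LLM: recommend a move toward the goal."""
    return 2 if state[0] < GOAL[0] else 0


def make_random_advisor(seed):
    """Ablation: recommendations carry no information about the task."""
    rng = random.Random(seed)
    return lambda state: rng.randrange(len(ACTIONS))


def null_advisor(state):
    """Ablation: no recommendation at all."""
    return None


def arbiter(q, state, advice, bonus, epsilon, rng):
    """Epsilon-greedy over Q-values, with a bonus on the advised action."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    scores = [q.get((state, a), 0.0) + (bonus if a == advice else 0.0)
              for a in range(len(ACTIONS))]
    return max(range(len(ACTIONS)), key=scores.__getitem__)


def run(advisor, episodes=10, max_steps=30, alpha=0.5, gamma=0.95,
        bonus=1.0, epsilon=0.1, seed=0):
    """Train for a fixed episode budget; return per-episode cumulative reward."""
    rng = random.Random(seed)
    q, returns = {}, []
    for _ in range(episodes):
        state, total = (0, 0), 0.0
        for _ in range(max_steps):
            a = arbiter(q, state, advisor(state), bonus, epsilon, rng)
            nxt, reward, done = step(state, a)
            best_next = 0.0 if done else max(
                q.get((nxt, b), 0.0) for b in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            # Standard tabular Q-learning update.
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state, total = nxt, total + reward
            if done:
                break
        returns.append(total)
    return returns
```

Calling `run(structured_advisor)`, `run(make_random_advisor(0))`, and `run(null_advisor)` with the same seed gives three return sequences under a matched budget of 10 episodes. Numbers from this toy grid say nothing about the paper's results; the point is only that swapping the advisor while holding everything else fixed is what makes the advice channel a controlled factor.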

Authors

Anup Tuladhar

Eli Kinney‐Lang

Journals

SHILAP Revista de lepidopterología

Frontiers in Robotics and AI

Institutions

University of Calgary

Azrieli College of Engineering Jerusalem

Suntek Computer Systems (China)
