What question did this study set out to answer?

This research aims to enhance the efficiency of task planning for reinforcement learning agents by utilizing LLM-based action masking.

May 30, 2026

Enhancing Task Planning Efficiency of Reinforcement Learning Agents through LLM-based Action Masking

Key Points

Uses a two-phase structure: the learning phase runs without masking so that the agent acquires a robust policy.
Applies LLM-generated, stage-wise action masks in the validation phase to guide exploration; the LLM derives the masks by analyzing the target geometry.
Evaluates the proposed method on autonomous excavation control.
Improves success rate by 16.9% and spatial accuracy by 38.6% over baseline reinforcement learning without masking.
Improves success rate by 10.5% and spatial accuracy by 31.2% over rule-based masking.
The method's advantage becomes more pronounced as the target area grows, which the authors take as evidence that the LLM's adaptive geometric decomposition and dynamic action masking cope effectively with increasing complexity.
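
To make the masking idea concrete, here is a minimal sketch of how a boolean action mask can be applied to the outputs of an already trained policy at validation time. It is not the authors' implementation: the function names, the example logits, and the four-action space are assumptions chosen for illustration, and the mask is hard-coded where the paper would obtain it from an LLM.

```python
import math


def masked_softmax(logits, mask):
    """Return action probabilities with disallowed actions set to zero.

    logits: list of floats, one per action (assumed to come from a trained policy).
    mask:   list of bools, True where the action is allowed.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("mask must allow at least one action")
    # Subtract the largest allowed logit for numerical stability.
    top = max(l for l, m in zip(logits, mask) if m)
    weights = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def greedy_masked_action(logits, mask):
    """Pick the highest-probability action among those the mask allows."""
    probs = masked_softmax(logits, mask)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical values: the policy prefers action 3, but the mask forbids it.
example_logits = [2.0, 1.0, 0.5, 3.0]
example_mask = [True, True, False, False]
example_probs = masked_softmax(example_logits, example_mask)
example_action = greedy_masked_action(example_logits, example_mask)
```

Because the mask only touches the action distribution at selection time, the policy itself can be trained without masking, which matches the two-phase structure described above.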

Abstract

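In complex sequential task planning, reinforcement learning agents face exploration inefficiency caused by large action spaces. This paper addresses the problem by proposing an action masking technique that uses a large language model (LLM). The proposed method adopts a two-phase learning-validation structure: in the learning phase the agent acquires a robust policy without masking, and in the validation phase it applies stage-wise action masks that the LLM generates by analyzing the target geometry. In experiments on autonomous excavation control, the proposed method improved success rate by 16.9% and spatial accuracy by 38.6% over baseline reinforcement learning without masking, and improved success rate by 10.5% and spatial accuracy by 31.2% over rule-based masking. The advantage of the proposed method became more pronounced as the target area size increased, which demonstrates that the LLM's adaptive geometric decomposition and dynamic action masking respond effectively to increasing complexity. This study shows that effectively integrating the reasoning ability of LLMs into the exploration process of reinforcement learning can improve both learning efficiency and performance in complex sequential task planning problems.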
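
The stage-wise masks mentioned in the abstract can be pictured with a small sketch. Everything here is an assumption made for illustration: the action space is taken to be one action per grid cell, the target area is decomposed into rectangular stages, and the rectangles are hard-coded where the paper would have the LLM produce them from the target geometry. The name `stage_masks` is hypothetical.

```python
def stage_masks(grid_width, grid_height, stages):
    """Build one boolean action mask per stage.

    Assumption: action index i corresponds to grid cell i in row-major order.
    stages: list of (x0, y0, x1, y1) rectangles with inclusive bounds,
            standing in for a geometric decomposition an LLM might return.
    """
    masks = []
    for x0, y0, x1, y1 in stages:
        mask = [
            x0 <= x <= x1 and y0 <= y <= y1
            for y in range(grid_height)
            for x in range(grid_width)
        ]
        masks.append(mask)
    return masks


# Hypothetical decomposition of a 4x2 target area into two 2x2 stages.
demo_masks = stage_masks(4, 2, [(0, 0, 1, 1), (2, 0, 3, 1)])
```

In this sketch each stage allows only the actions inside its own sub-region, so a larger target area simply yields more stages rather than a larger set of candidate actions per step.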

Cite This Study

Cho et al. studied this question.

synapsesocial.com/papers/6a1a82640307b78509434136 https://doi.org/10.5573/ieie.2026.63.5.120
