March 3, 2026 | Open Access

Partial observability in vision-based forklift navigation with compressed visual information

Key Points

The DRL agent reduces lateral mean absolute error by up to 78% with the direction estimation module.
Domain Randomization experiments show that the module becomes even more important when bounding box detections are inaccurate.
A privileged agent that generates expert demonstrations for the DRL agent and the LSTM network accelerates training for navigation tasks.
The proposed method cuts down observation generation time by over 89%, improving overall system efficiency.

Abstract

In industry, transportation tasks are increasingly handled by mobile robotic systems such as Automated Guided Vehicles (AGVs). Deploying such systems alongside humans requires handling load carriers whose positions are imprecisely known or unknown. In this work, we apply our previously introduced method to control a forklift AGV based on a single RGB camera and a Deep Reinforcement Learning (DRL) agent. This agent uses compressed visual information in the form of bounding box data to perform the final approach and precise alignment in front of these load carriers. The limited field of view of the camera makes the environment state only partially observable, a typical issue for vision-based vehicles. To address this problem, we propose a direction estimation module, which uses a Long Short-Term Memory (LSTM) network to keep track of previous interactions with the environment. We design the proposed module to provide an additional input for the agent, enabling independent training and verification of the system components. Through this extension, our DRL agent reduces the lateral mean absolute error by up to 78% compared to the DRL baseline without the direction estimation module. Applying Domain Randomization (DR) to investigate the influence of inaccurate bounding box detections revealed that the direction estimation module becomes even more important when combined with an imprecise detector. We also apply two distinct methods to speed up our training process. First, a privileged agent is employed to generate expert demonstrations for the training of the DRL agent and the LSTM network. Second, we accelerate the generation of bounding box data by projecting tightly fitted 3D bounding boxes. This method reduces the time required to generate observations by more than 89%.
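The projection step mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal pinhole camera without lens distortion, a box that is axis-aligned in the camera frame (z pointing forward), and a hypothetical function name and parameter set (`project_box_to_bbox`, `fx`, `fy`, `cx`, `cy`) chosen only for illustration. The idea is that the eight corners of a tightly fitted 3D box are projected into the image, and the 2D bounding box is the smallest rectangle that encloses them.

```python
from itertools import product


def project_box_to_bbox(center, size, fx, fy, cx, cy):
    """Project a 3D box to a 2D bounding box (hypothetical helper).

    Assumptions (not taken from the paper): ideal pinhole camera,
    box axis-aligned in the camera frame, z axis pointing forward.

    center: (x, y, z) of the box center in the camera frame
    size:   (width, height, depth) of the box
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels
    Returns (u_min, v_min, u_max, v_max), or None if any corner lies
    on or behind the camera plane.
    """
    half = [s / 2.0 for s in size]
    us, vs = [], []
    # Enumerate the eight corners via all sign combinations.
    for signs in product((-1.0, 1.0), repeat=3):
        x, y, z = (c + s * h for c, s, h in zip(center, signs, half))
        if z <= 0.0:
            return None  # corner cannot be projected
        # Pinhole projection of a single corner.
        us.append(fx * x / z + cx)
        vs.append(fy * y / z + cy)
    # The 2D bounding box encloses all projected corners.
    return (min(us), min(vs), max(us), max(vs))


bbox = project_box_to_bbox(
    center=(0.0, 0.0, 2.0), size=(2.0, 2.0, 2.0),
    fx=100.0, fy=100.0, cx=320.0, cy=240.0,
)
```

Because this replaces image rendering and detection with a handful of arithmetic operations per corner, it is plausible that observations can be generated much faster this way; the reported saving of more than 89% is the paper's measurement, not something this sketch reproduces.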
