What question did this study set out to answer?

The research aims to develop an autonomous system for mobile manipulators to detect and grasp objects based on natural language prompts.

May 9, 2026 | Open Access

Autonomous object detection and manipulation using a mobile cobot

Key Points

The research aims to develop an autonomous system for mobile manipulators to detect and grasp objects based on natural language prompts.
The system combines frontier-based exploration with grasping and requires no prior maps or object-specific training.
A lightweight vision-language model performs open-vocabulary object detection in real time on embedded GPU hardware.
The system was evaluated in two real-world indoor environments across multiple exploration scenarios.
Frontier-based exploration reduced execution time and traveled path length compared with a straight-line baseline, with significant improvements in occluded environments.
The proposed method also improved grasp success rates when the robot had to navigate narrow passages.
The system demonstrated practical feasibility for real-time autonomous manipulation in resource-limited settings.

Abstract

Autonomous mobile manipulators operating in unknown environments must tightly couple exploration, perception, and manipulation under strict computational and sensing constraints. This paper presents a fully onboard exploration-to-grasp system that enables a mobile cobot to autonomously search for, detect, and grasp a target object specified by a natural-language prompt without prior maps or object-specific training. The proposed system integrates frontier-based exploration with camera-aware coverage planning to reduce redundant motion and promote informative viewpoints. Open-vocabulary object detection is performed using a lightweight vision-language model optimized for real-time inference on embedded GPU hardware. Upon stable detection, a deterministic detection-to-grasp pipeline computes feasible standoff poses and executes a constrained grasp sequence tailored to the target object geometry. The approach is evaluated in two real-world indoor environments with multiple exploration scenarios. Experimental results demonstrate that frontier-based exploration significantly outperforms a straight-line baseline in terms of execution time, traveled path length, and grasp success, particularly in environments with occlusions and narrow passages. The findings highlight the practical feasibility of integrating open-vocabulary perception and autonomous exploration for reliable mobile manipulation on resource-constrained cyber-physical systems.
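The frontier-based exploration mentioned in the abstract can be illustrated with a minimal sketch. It assumes a 2D occupancy grid in which each cell is free, occupied, or unknown, and it treats a frontier as a free cell that touches at least one unknown cell. The cell labels, function names, and nearest-frontier selection rule are hypothetical choices for illustration only. The paper's own planner additionally uses camera-aware coverage planning, which is not reproduced here.

```python
from typing import List, Optional, Tuple

# Hypothetical cell labels for this sketch (not taken from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1

Cell = Tuple[int, int]


def find_frontier_cells(grid: List[List[int]]) -> List[Cell]:
    """Return (row, col) of every free cell with a 4-connected unknown neighbour."""
    rows, cols = len(grid), len(grid[0])
    frontiers: List[Cell] = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbour is enough
    return frontiers


def nearest_frontier(frontiers: List[Cell], robot: Cell) -> Optional[Cell]:
    """Pick the frontier closest to the robot by Manhattan distance (assumed rule)."""
    return min(
        frontiers,
        key=lambda cell: abs(cell[0] - robot[0]) + abs(cell[1] - robot[1]),
        default=None,  # no frontiers left means exploration is complete
    )


if __name__ == "__main__":
    grid = [
        [FREE, FREE, UNKNOWN],
        [FREE, OCCUPIED, UNKNOWN],
        [FREE, FREE, FREE],
    ]
    cells = find_frontier_cells(grid)
    print(cells)
    print(nearest_frontier(cells, robot=(2, 0)))
```

In a real system the selected frontier would be handed to a path planner as the next navigation goal, and the grid would be updated from sensor data after each move. Choosing the nearest frontier is only one possible policy; a planner could instead weight frontiers by expected information gain or camera visibility.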
