What question did this study set out to answer?

The aim is to develop a reliable ROS-based pipeline for robotic box detection and manipulation, combining classical computer vision with analytical grasp planning.

June 4, 2026 | Open Access

ROS-Based Automation of Box Detection and Manipulation Using Computer Vision and Analytical Grasp Planning

Key Points

Developed a modular ROS architecture integrating RGB detection, 3D reconstruction, and 6-DoF pose estimation.
Implemented analytical grasp planning with stability evaluation and collision avoidance in MoveIt.
Executed the grasp with a structured sequence including recovery actions like re-detection and re-planning.
The system handles sensor noise and illumination variation through temporal stabilization of pose estimates and cross-modal consistency validation.
Top and side grasp candidates were generated and evaluated for stability and feasibility, leading to successful executions.
Demonstrated deterministic performance and easy adaptability to various robotic platforms.
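
The structured execution sequence with recovery actions can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `run_pick_and_place`, the callbacks `detect`, `rank_grasps`, and `execute_stage`, and the retry limit `max_attempts` are all hypothetical stand-ins for the corresponding ROS and MoveIt components.

```python
from typing import Callable, List, Optional

# Stage names follow the pregrasp-grasp-lift-place sequence.
STAGES = ["pregrasp", "grasp", "lift", "place"]


def run_pick_and_place(
    detect: Callable[[], Optional[dict]],
    rank_grasps: Callable[[dict], List[str]],
    execute_stage: Callable[[str, str], bool],
    max_attempts: int = 3,  # hypothetical retry limit
) -> bool:
    """Run the staged sequence, falling back to recovery actions on failure."""
    for _ in range(max_attempts):
        pose = detect()  # detection, or re-detection on a later attempt
        if pose is None:
            continue  # nothing detected, try again
        for grasp in rank_grasps(pose):  # candidates, best first
            # all() stops at the first stage that reports failure.
            if all(execute_stage(stage, grasp) for stage in STAGES):
                return True
            # A stage failed: move on to the next-ranked candidate.
    return False
```

Passing the detector, ranker, and executor as callbacks keeps the control flow testable without a robot. A real system would likely also re-detect after a failed lift, since the box may have moved.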

Abstract

This paper presents a ROS-based pipeline for reliable box detection and robotic manipulation that combines classical computer vision with analytical grasp planning. The pipeline integrates RGB-based detection, 3D reconstruction, 6-DoF pose estimation, analytical grasp planning, and collision-free motion execution within a modular architecture. In contrast to purely learning-based approaches, the system emphasizes interpretability, deterministic performance, and deployment readiness without requiring additional data collection or neural-network retraining. Robustness to sensor noise, illumination variation, and partial occlusions is achieved through temporal stabilization of pose estimates and cross-modal consistency validation. The analytical grasp planner generates top and side grasp candidates and evaluates them based on force-closure stability, gripper constraints, inverse-kinematics feasibility, and collision avoidance in MoveIt. The selected grasp is executed through a structured pregrasp–grasp–lift–place sequence with recovery actions (re-detection, re-ranking, re-planning) in case of failure. The main contribution is a unified ROS architecture that closely links pose estimation with formal grasp-quality evaluation, providing deterministic operation, transparent decision-making, and easy adaptation to other prismatic objects and hardware platforms.
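
To make the idea of top and side grasp candidates concrete, the following Python sketch enumerates candidates for an axis-aligned box and filters them by gripper opening. It is an illustration under stated assumptions, not the paper's planner: the `GraspCandidate` type, the `generate_candidates` function, and the ordering heuristic are hypothetical, and the force-closure, inverse-kinematics, and collision checks described above are omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GraspCandidate:
    kind: str                       # "top" or "side"
    approach: Tuple[int, int, int]  # approach direction in the box frame
    closing_axis: str               # box axis the fingers close along
    width: float                    # required gripper opening in metres


def generate_candidates(
    size: Tuple[float, float, float], max_opening: float
) -> List[GraspCandidate]:
    """Enumerate top and side grasps for a box of size (lx, ly, lz)."""
    lx, ly, _lz = size
    raw = [
        GraspCandidate("top", (0, 0, -1), "x", lx),
        GraspCandidate("top", (0, 0, -1), "y", ly),
        GraspCandidate("side", (1, 0, 0), "y", ly),
        GraspCandidate("side", (-1, 0, 0), "y", ly),
        GraspCandidate("side", (0, 1, 0), "x", lx),
        GraspCandidate("side", (0, -1, 0), "x", lx),
    ]
    # Gripper constraint: the box must fit between the fingers.
    feasible = [g for g in raw if g.width <= max_opening]
    # Placeholder ordering (an assumption, not a force-closure metric):
    # narrower grasps first, top grasps before side grasps on ties.
    return sorted(feasible, key=lambda g: (g.width, g.kind != "top"))


if __name__ == "__main__":
    for g in generate_candidates((0.30, 0.06, 0.10), max_opening=0.08):
        print(g)
```

For a 0.30 m by 0.06 m by 0.10 m box and a 0.08 m gripper, only the grasps that close across the 0.06 m side survive the filter. In a full pipeline each survivor would then be scored for stability and checked for reachability and collisions.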
