This paper presents a ROS-based pipeline for reliable detection and robotic manipulation that combines classical computer vision with analytical grasp planning. The pipeline integrates RGB-based detection, 3D reconstruction, 6-DoF pose estimation, analytical grasp planning, and collision-free motion execution within a modular architecture. In contrast to purely learning-based approaches, the system emphasizes interpretability, deterministic performance, and deployment readiness without requiring additional data collection or neural-network retraining. Robustness to sensor noise, illumination variation, and partial occlusions is achieved through temporal stabilization of pose estimates and cross-modal consistency validation. The analytical grasp planner generates top and side grasp candidates and evaluates them based on force-closure stability, gripper constraints, inverse-kinematics feasibility, and collision avoidance in MoveIt. The selected grasp is executed through a structured pregrasp–grasp–lift–place sequence with recovery actions (re-detection, re-ranking, re-planning) in case of failure. The main contribution is a unified ROS architecture that closely links pose estimation with formal grasp-quality evaluation, providing deterministic operation, transparent decision-making, and easy adaptation to other prismatic objects and hardware platforms.
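The grasp selection and recovery steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `GraspCandidate` fields, the `MAX_OPENING_M` limit, and the `rank_grasps` and `select_with_recovery` helpers are hypothetical names introduced here for illustration. The sketch assumes that the force-closure score, inverse-kinematics feasibility, and collision result have already been computed for each candidate (in the pipeline these checks would come from the grasp planner and MoveIt), and it models only the re-ranking fallback, not re-detection or re-planning.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical gripper limit in metres; a real value depends on the hardware.
MAX_OPENING_M = 0.08


@dataclass(frozen=True)
class GraspCandidate:
    """One top or side grasp with precomputed check results (assumed inputs)."""
    name: str             # label such as "top" or "side"
    quality: float        # force-closure score, higher is better (assumed scale)
    opening_m: float      # gripper opening the grasp requires, in metres
    ik_feasible: bool     # an inverse-kinematics solution was found
    collision_free: bool  # the approach does not collide with the scene


def rank_grasps(candidates: List[GraspCandidate],
                max_opening_m: float = MAX_OPENING_M) -> List[GraspCandidate]:
    """Drop candidates that violate a hard constraint, then sort by quality."""
    feasible = [
        c for c in candidates
        if c.opening_m <= max_opening_m and c.ik_feasible and c.collision_free
    ]
    return sorted(feasible, key=lambda c: c.quality, reverse=True)


def select_with_recovery(ranked: List[GraspCandidate],
                         execute: Callable[[GraspCandidate], bool]
                         ) -> Optional[GraspCandidate]:
    """Try candidates in ranked order; fall back to the next one on failure."""
    for candidate in ranked:
        if execute(candidate):
            return candidate
    return None  # every candidate failed; a caller could trigger re-detection


if __name__ == "__main__":
    candidates = [
        GraspCandidate("top_a", 0.90, 0.06, True, False),   # in collision
        GraspCandidate("side_a", 0.70, 0.05, True, True),
        GraspCandidate("side_wide", 0.95, 0.10, True, True),  # too wide
        GraspCandidate("top_b", 0.60, 0.06, True, True),
    ]
    ranked = rank_grasps(candidates)
    # Stand-in executor that fails on the first attempt to exercise the fallback.
    chosen = select_with_recovery(ranked, lambda c: c.name != "side_a")
    print([c.name for c in ranked], chosen.name if chosen else None)
```

Treating gripper width, IK feasibility, and collision status as hard filters and using the force-closure score only for ordering keeps the selection deterministic and easy to inspect, which matches the interpretability goal stated above.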
Kurmangalieva et al. studied this question.