What question did this study set out to answer?

The aim is to develop a real-time grasping framework for robots with limited computational resources.

January 21, 2026 | Open Access

Real-Time Target-Oriented Grasping Framework for Resource-Constrained Robots

Key Points

Support click-based grasping for unknown objects and category-based grasping for known objects.
Compress YOLOv8 using structured pruning to reduce model complexity.
Utilize pretrained GR-ConvNetv2 for predicting grasp poses.
Use MobileSAMv2 to generate masks that restrict grasp poses to target objects.
Incorporate a geometry-based correction module to enhance grasp accuracy.
Achieved a grasp success rate of 98.8% on the Cornell dataset and 95.8% on the Jacquard dataset.
Demonstrated over 90% success rates in real-world single-object and cluttered scenarios.
Maintained real-time performance at 67 ms per frame in click-based mode and 75 ms per frame in category-specified mode.
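
To put the reported per-frame latencies in more familiar terms, the short sketch below converts them to approximate frame rates. The helper name `frames_per_second` is introduced here for illustration only; the latency figures are the ones reported in the paper summary, and the conversion assumes frames are processed one after another with no pipelining.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Latencies reported for the two operating modes (ms per frame).
reported_latency_ms = {"click-based": 67, "category-specified": 75}

for mode, latency in reported_latency_ms.items():
    print(f"{mode}: {frames_per_second(latency):.1f} fps")
# click-based: 14.9 fps
# category-specified: 13.3 fps
```

Under that assumption, both modes run at roughly 13 to 15 frames per second.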

Abstract

Target-oriented grasping has become increasingly important in household and industrial environments, and deploying such systems on mobile robots is particularly challenging due to limited computational resources. To address these limitations, we present an efficient framework for real-time target-oriented grasping on resource-constrained platforms, supporting both click-based grasping for unknown objects and category-based grasping for known objects. To reduce model complexity while maintaining detection accuracy, YOLOv8 is compressed using a structured pruning method. For grasp pose generation, a pretrained GR-ConvNetv2 predicts candidate grasps, which are restricted to the target object using masks generated by MobileSAMv2. A geometry-based correction module then adjusts the position, angle, and width of the initial grasp poses to improve grasp accuracy. Finally, extensive experiments were carried out on the Cornell and Jacquard datasets, as well as in real-world single-object, cluttered, and stacked scenarios. The proposed framework achieves grasp success rates of 98.8% on the Cornell dataset and 95.8% on the Jacquard dataset, with over 90% success in real-world single-object and cluttered settings, while maintaining real-time performance of 67 ms and 75 ms per frame in the click-based and category-specified modes, respectively. These experiments demonstrate that the proposed framework achieves high grasping accuracy and robust performance, with an efficient design that enables deployment on mobile and resource-constrained robots.
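
The step of restricting candidate grasps to the target object can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the grasp network yields per-pixel quality, angle, and width maps, and that the segmentation model yields a binary mask of the same size. The function name `select_masked_grasp` and the toy arrays are hypothetical.

```python
import numpy as np


def select_masked_grasp(quality, angle, width, mask):
    """Pick the highest-quality grasp whose centre pixel lies inside the mask.

    All inputs are 2-D arrays of identical shape (an assumption of this sketch).
    Returns None when the mask is empty.
    """
    mask = mask.astype(bool)
    if not mask.any():
        return None
    # Pixels outside the target get -inf so they can never win the argmax.
    masked_quality = np.where(mask, quality, -np.inf)
    row, col = np.unravel_index(np.argmax(masked_quality), masked_quality.shape)
    return {
        "row": int(row),
        "col": int(col),
        "angle": float(angle[row, col]),
        "width": float(width[row, col]),
        "quality": float(quality[row, col]),
    }


# Toy 4x4 scene: the globally best grasp (0.9) sits on a non-target object.
quality = np.array([
    [0.9, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.2, 0.1],
    [0.1, 0.3, 0.7, 0.4],
    [0.1, 0.2, 0.5, 0.3],
])
angle = np.zeros((4, 4))
angle[2, 2] = 0.5           # radians, illustrative value
width = np.full((4, 4), 20.0)
width[2, 2] = 30.0          # pixels, illustrative value

mask = np.zeros((4, 4), dtype=bool)
mask[2:, 1:] = True         # target object occupies the lower-right region

grasp = select_masked_grasp(quality, angle, width, mask)
print(grasp["row"], grasp["col"], grasp["quality"])  # 2 2 0.7
```

In this toy scene the unmasked maximum at pixel (0, 0) is ignored because it falls outside the target, and the best in-mask candidate at (2, 2) is returned. A geometry-based correction such as the one described in the abstract would then refine that candidate's position, angle, and width.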
