What question did this study set out to answer?

This study aims to improve the accuracy of tomato pose recognition and the precision of picking-point localization for robotic harvesting.

March 19, 2026 | Open Access

A Detection Method for Tomato Pose Estimation and Grasping Point Localization in Robotic Harvesting Based on YOLOv8s-ECC

Key points

This study aims to improve the accuracy of tomato pose recognition and the precision of picking-point localization for robotic harvesting.
Developed YOLOv8s-ECC, an object detection model based on YOLOv8s.
Integrated the Efficient Channel Attention (ECA) and Coordinate Attention mechanisms into the Backbone network.
Introduced the Convolutional Block Attention Module (CBAM) into the Neck network.
Tested the model on a picking robot in a simulated environment with different lighting conditions and foliage occlusion.
Achieved accuracy rates of 81.7% for tomato fruits and 87.5% for calyces.
Achieved recall rates of 92.7% for fruits and 85.9% for calyces.
Achieved mAP@50 of 89.7% for fruits and 91.3% for calyces.
Demonstrated a picking success rate of 93.02% with an average picking operation time of 14.2 ± 0.855 s.

Abstract

In intelligent tomato picking, insufficient accuracy in recognizing the growth pose of target tomatoes and inaccurate positioning of picking and grasping points have led to low efficiency in automated picking. To address these issues, this paper introduces an optimized object detection model based on YOLOv8s, termed YOLOv8s-ECC. The model is built around the principle of “judging tomato pose by the spatial vector of the relative position between the calyx and the center point of the fruit,” and aims at high-precision localization of both the tomato calyx and the fruit, thereby laying the groundwork for subsequent pose judgment and picking-point positioning. We integrated the ECA (Efficient Channel Attention) and Coordinate Attention mechanisms into the Backbone network and introduced the CBAM (Convolutional Block Attention Module) into the Neck network. Together, these attention mechanisms overcome the recognition difficulty caused by the calyx’s color and texture, which closely resemble the surrounding environment, and they also make the model more robust in complex field environments. Test results indicate significant improvements over the original model: for tomato fruits and calyces respectively, the accuracy rate is 81.7% and 87.5%, the recall rate is 92.7% and 85.9%, and mAP@50 is 89.7% and 91.3%. After the algorithm was encapsulated and integrated with the picking robot, tests in a simulated environment (with different lighting conditions and foliage occlusion) showed a picking success rate of 93.02% and an average picking operation time of 14.2 ± 0.855 s, including 0.035 s for image recognition and processing. This research offers an effective technical solution for high-precision visual perception and pose judgment in fruit and vegetable picking robots, contributing to higher-quality picking operations in the tomato industry.
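
The pose principle quoted in the abstract can be illustrated with a minimal sketch. It assumes the detector returns one axis-aligned bounding box each for the fruit and the calyx in pixel coordinates, and it works only in the 2D image plane. The abstract does not give the paper's actual formulation (which may use 3D camera coordinates), so `Box`, `pose_vector`, and `pose_angle_deg` are hypothetical names introduced here for illustration.

```python
import math
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned detection box in pixels (hypothetical layout: x1, y1, x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def center(self) -> Tuple[float, float]:
        """Geometric center of the box."""
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def pose_vector(fruit: Box, calyx: Box) -> Tuple[float, float]:
    """Vector from the fruit center to the calyx center, in pixels."""
    fx, fy = fruit.center()
    cx, cy = calyx.center()
    return (cx - fx, cy - fy)


def pose_angle_deg(fruit: Box, calyx: Box) -> float:
    """Angle of the pose vector measured from image 'up', clockwise positive.

    Image y grows downward, so 'up' is the direction (0, -1).
    0 degrees means the calyx sits directly above the fruit center.
    """
    dx, dy = pose_vector(fruit, calyx)
    if dx == 0 and dy == 0:
        raise ValueError("calyx and fruit centers coincide; pose is undefined")
    return math.degrees(math.atan2(dx, -dy))


# Illustrative boxes only; these are not values from the paper.
fruit_box = Box(100, 100, 200, 200)
calyx_above = Box(140, 80, 160, 100)
calyx_right = Box(200, 140, 220, 160)

vec_above = pose_vector(fruit_box, calyx_above)
angle_above = pose_angle_deg(fruit_box, calyx_above)
angle_right = pose_angle_deg(fruit_box, calyx_right)
```

With this convention, an upright tomato whose calyx lies straight above the fruit center has an angle near 0 degrees, and a tomato tilted to the right in the image has a positive angle. A real system would also need to pair each calyx with its fruit and, for grasp planning, lift the vector into 3D using depth data; both steps are outside this sketch.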
