In intelligent tomato picking, inaccurate recognition of the growth pose of target tomatoes and imprecise positioning of picking and grasping points lead to low efficiency in automated picking. To address these issues, this paper introduces an object detection model optimized from YOLOv8s, termed YOLOv8S-ECC. The model is built around the idea of “judging tomato pose by the spatial vector of the relative position between the calyx and the center point of the fruit,” and aims at high-precision positioning of both the tomato calyx and the fruit, laying the groundwork for subsequent pose judgment and picking-point positioning. We integrate the ECA (Efficient Channel Attention) and Coordinate Attention mechanisms into the Backbone network and introduce the CBAM (Convolutional Block Attention Module) attention mechanism into the Neck network. Combined, these attention mechanisms effectively overcome the difficulty of recognizing the calyx, whose color and texture closely resemble the surrounding environment, and improve the model’s robustness in complex field environments. Test results indicate significant improvements over the original model: for tomato fruits and calyces, respectively, the accuracy rate is 81.7% and 87.5%, the recall rate is 92.7% and 85.9%, and mAP@50 is 89.7% and 91.3%. With the algorithm encapsulated and integrated into the picking robot, tests in a simulated environment (varied lighting conditions and foliage occlusion) show a picking success rate of 93.02% and an average picking operation time of 14.2 ± 0.855 s, including 0.035 s for image recognition and processing. This research offers an effective technical solution for high-precision visual perception and pose judgment in fruit and vegetable picking robots, contributing to improved quality of picking operations in the tomato industry.
Zhuang et al. studied this question.
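The pose cue named in the abstract, the vector from the fruit center to the calyx, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes 2D axis-aligned detection boxes in pixel coordinates, whereas the paper's "spatial vector" may be defined in 3D. The names `Box`, `pose_vector`, and `pose_angle_deg` are hypothetical and chosen only for illustration.

```python
import math
from typing import NamedTuple, Optional, Tuple


class Box(NamedTuple):
    """Hypothetical detection box in pixel coordinates (x1, y1, x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def center(self) -> Tuple[float, float]:
        # Midpoint of the box along each axis.
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def pose_vector(fruit: Box, calyx: Box) -> Tuple[float, float]:
    """Vector from the fruit center to the calyx center (image coordinates)."""
    fx, fy = fruit.center()
    cx, cy = calyx.center()
    return (cx - fx, cy - fy)


def pose_angle_deg(fruit: Box, calyx: Box) -> Optional[float]:
    """Angle of the fruit-to-calyx vector, in degrees.

    Assumed convention: 0 means the calyx is directly above the fruit
    center in the image, positive angles turn clockwise. Image y grows
    downward, so "up" is the negative y direction.
    Returns None when both centers coincide (direction undefined).
    """
    dx, dy = pose_vector(fruit, calyx)
    if dx == 0 and dy == 0:
        return None
    return math.degrees(math.atan2(dx, -dy))


if __name__ == "__main__":
    fruit = Box(100, 100, 200, 200)
    calyx = Box(140, 90, 160, 110)
    print(pose_vector(fruit, calyx), pose_angle_deg(fruit, calyx))
```

In a real pipeline the two boxes would come from the detector's fruit and calyx classes, and a depth measurement would be needed to lift this image-plane vector into a 3D pose estimate.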