This study addresses the challenge of accurately localizing picking points on mature daylily buds under complex field conditions, including diverse lighting, arbitrary angles, and different bud stages. The perianth tube is the grasping region for harvesting, and its morphological similarity across stages makes the task harder. A multi-stage pipeline is proposed: a Faster R-CNN detector enhanced with a Convolutional Block Attention Module (CBAM) simultaneously identifies mature buds and perianth tubes, and a post-detection judgment module then selects only the tubes associated with mature buds. The 2D picking point is taken as the center of the perianth tube bounding box, and an RGB-D camera is used to transform it into 3D coordinates in the camera coordinate system. Experimental results show that the proposed method improved the Average Precision (AP) for perianth tubes associated with mature buds by 29.72% over the original Faster R-CNN, and achieved a 2D localization accuracy of 98.70% and a 3D localization accuracy of 90.79% under an allowable tolerance of ±5 mm, for an overall picking point localization accuracy of 71.88%. This study provides a theoretical basis and data support for visual localization in a robotic daylily harvester.
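The last two stages of the pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The association rule (a tube is kept if its box overlaps a mature-bud box), the pinhole camera model, and all names (`Box`, `tubes_of_mature_buds`, `deproject`, the intrinsics `fx`, `fy`, `cx`, `cy`) are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def center(self):
        # 2D picking point: center of the bounding box.
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def overlap_area(a, b):
    """Area of the intersection of two boxes (0.0 if they do not overlap)."""
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return max(w, 0.0) * max(h, 0.0)


def tubes_of_mature_buds(tubes, mature_buds):
    """Assumed judgment rule: keep tubes whose box overlaps any mature-bud box."""
    return [t for t in tubes
            if any(overlap_area(t, b) > 0 for b in mature_buds)]


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with depth (meters) to camera coordinates.

    Assumes an undistorted pinhole model with focal lengths fx, fy and
    principal point (cx, cy), all in pixels.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


if __name__ == "__main__":
    buds = [Box(100, 100, 200, 300)]
    tubes = [Box(130, 280, 170, 340), Box(400, 280, 440, 340)]
    for tube in tubes_of_mature_buds(tubes, buds):
        u, v = tube.center()
        # Depth would come from the aligned depth image; 0.5 m is a placeholder.
        print(deproject(u, v, 0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0))
```

In practice the depth at the picking point would be read from a depth frame aligned to the color image, and the intrinsics would come from the camera's calibration. Both are outside the scope of this sketch.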
Feng et al. (Sun,) studied this question.