Accurate target perception in unstructured outdoor environments remains a fundamental challenge in computational imaging and machine vision, primarily because of severe optical degradation caused by variable illumination, specular highlights, and dense foliage occlusion. Existing optical sensing systems often struggle to remain robust under these conditions, especially when deployed on edge devices with strict computational limits. To address these challenges, this paper proposes Orchard-YOLO, a lightweight object detection network designed to tolerate environmental and optical noise in complex orchard scenes. Orchard-YOLO introduces three architectural enhancements: (1) a high-resolution P2 detection head that preserves high-frequency optical details and fine-grained texture cues often lost during downsampling; (2) Coordinate Attention (CA) modules integrated into the feature fusion pathway to suppress background optical interference and improve spatial discrimination of heavily occluded targets; and (3) a Ghost-convolution-based backbone that reduces inference cost for real-time edge processing. Evaluated on a multi-fruit dataset under simulated optical stress (including ±50% illumination variation and up to 70% occlusion), Orchard-YOLO achieves 94.8% mAP@0.5. It is more robust to illumination variation and occlusion than the baseline models and runs at up to 25 FPS on an NVIDIA Jetson Nano edge device. These results suggest that Orchard-YOLO is a suitable detection framework for resource-constrained orchard perception.
Wang et al. (Mon,) studied this question.
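The simulated optical stress used in the evaluation (±50% illumination variation and up to 70% occlusion) can be illustrated with a minimal sketch. The abstract does not specify how the stress was generated, so everything below is an assumption made for illustration: illumination change is modeled as a global intensity scale factor drawn from [0.5, 1.5], and occlusion as a flat patch covering a fraction of the target's bounding box. The function names `scale_illumination`, `occlude_box`, and `stress` are hypothetical and are not taken from the Orchard-YOLO implementation.

```python
import random


def scale_illumination(image, factor):
    """Scale pixel intensities by `factor`, clipping to the 8-bit range [0, 255]."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


def occlude_box(image, box, fraction, fill=0):
    """Overwrite the left `fraction` of a bounding box with a flat patch.

    `box` is (x0, y0, x1, y1) with exclusive upper bounds. A flat patch is an
    assumed stand-in for foliage; the input image is not modified.
    """
    x0, y0, x1, y1 = box
    cut = x0 + round((x1 - x0) * fraction)
    out = [row[:] for row in image]
    for y in range(y0, y1):
        for x in range(x0, cut):
            out[y][x] = fill
    return out


def stress(image, box, rng, max_illum=0.5, max_occ=0.7):
    """Apply a random illumination change and a random occlusion level.

    The defaults mirror the limits quoted in the abstract; uniform sampling
    within those limits is an assumption.
    """
    factor = 1.0 + rng.uniform(-max_illum, max_illum)
    fraction = rng.uniform(0.0, max_occ)
    stressed = occlude_box(scale_illumination(image, factor), box, fraction)
    return stressed, factor, fraction


# A 10x10 grayscale image with one target that fills the whole frame.
image = [[100] * 10 for _ in range(10)]
box = (0, 0, 10, 10)

brighter = scale_illumination(image, 1.5)  # +50% illumination
darker = scale_illumination(image, 0.5)    # -50% illumination
masked = occlude_box(image, box, 0.7)      # 70% of the box covered
sample, factor, fraction = stress(image, box, random.Random(0))
```

In a real pipeline the same idea would be applied to RGB arrays with an image library, and the occluder would more plausibly be a leaf texture than a flat patch. The sketch only fixes what the two stress levels mean numerically.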