• Introducing a lightweight Edge-YOLOv11 framework for UAV orchard monitoring
• Designing novel modules to overcome foliage occlusion and extreme scale variations
• Achieving real-time robust perception on resource-constrained edge devices

The transition toward precision agriculture demands highly efficient methods for crop monitoring. However, deploying deep learning models on unmanned aerial vehicles (UAVs) in complex orchard environments poses critical challenges, notably small target sizes, severe foliage occlusion, and the strict computational constraints of edge devices. To address these bottlenecks, this study proposes Edge-YOLOv11, a lightweight detection framework explicitly optimized for UAV-based litchi perception. The architecture synergizes three strategic innovations: First, a Multi-Scale Re-parameterized Encoding module constructs dynamic hierarchical receptive fields to robustly capture extreme scale variations. Second, a Spatial-Attention Partial Feature Fusion strategy resolves the conflict between feature redundancy and semantic integrity via a reduce-and-recover paradigm. Finally, an Occlusion-Aware Semantic Alignment Head explicitly disentangles densely clustered targets, effectively mitigating the adverse effects of foliage and mutual fruit occlusion. Extensive experiments on a dedicated UAV litchi dataset demonstrate the framework’s superiority. Edge-YOLOv11 achieves a mean average precision (mAP) of 90.1% and an F1-score of 85.5%, outperforming multiple state-of-the-art baselines. Crucially, the model maintains a highly compact parameter size of merely 6.35 MB. When deployed on a Jetson Orin Nano edge device with TensorRT optimization, it sustains a real-time inference speed of 26.52 FPS while consuming only 135 MB of GPU memory. Furthermore, supplementary evaluations on public tomato and citrus datasets confirm the model’s robust generalization capabilities, highlighting its significant potential as a versatile perception solution for diverse crop detection tasks in precision agriculture.
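
To make the reported metrics easier to interpret, the following minimal Python sketch shows how an F1-score is derived from precision and recall, and how a throughput figure such as 26.52 FPS translates into an average per-frame latency. The function names `f1_score` and `latency_ms` are illustrative and do not come from the paper. The precision and recall values in the usage lines are hypothetical placeholders, since the abstract reports only the resulting F1-score.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def latency_ms(fps: float) -> float:
    """Average per-frame latency in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # 26.52 FPS is the throughput reported in the abstract.
    print(f"{latency_ms(26.52):.1f} ms per frame")
    # Hypothetical precision/recall values, chosen only for illustration.
    print(f"F1 = {f1_score(0.80, 0.90):.3f}")
```

At the reported 26.52 FPS, the average per-frame budget is roughly 37.7 ms, which covers whatever stages were included in the authors' timing; the abstract does not specify whether pre- and post-processing are counted.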
Peng et al. (Wed,) studied this question.