Accurate detection of green walnuts in complex orchard environments is critical for automated yield estimation, yet it remains challenging due to extreme color similarity with foliage, dense distribution, frequent occlusion, and very small apparent size in UAV imagery. Existing methods rely solely on RGB input and lack robustness under variable field conditions. This paper proposes GeoFuse-YOLO, a lightweight multi-source fusion network that integrates UAV-acquired RGB and near-infrared imagery with six terrain-inspired geometric prior features derived from the RGB images via classical differential operators. Built upon YOLO11, the architecture features differentiated channel allocation, two-stage progressive fusion with zero-initialized gating, a P2–P4 detection hierarchy, and two novel modules, R²-SPDConv and CSP-LOK, for small-target refinement. On a self-collected dataset (700 triplets, 73,627 instances, 100% small targets), GeoFuse-YOLO achieves 78.0% mAP50 and 29.5% mAP50:95 with only 1.02 M parameters, surpassing the best single-modality model by +13.4% mAP50 while being 6× lighter. Comprehensive evaluation through progressive ablation, geometric channel analysis, fusion strategy comparison, five-fold cross-validation (±0.30%), Grad-CAM visualization, and Raspberry Pi 5 deployment (277 ms/image) validates the approach's effectiveness, robustness, and practical deployability. The code is available at https://github.com/LiuShuangYao/GeoFuse-YOLO.
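Two ideas from the abstract, terrain-inspired features obtained with differential operators and fusion through a zero-initialized gate, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not name the six geometric features, so a slope (gradient-magnitude) map is used here only as an assumed stand-in, and the names `slope_map` and `gated_fuse` are hypothetical. The gate is a plain scalar here, whereas in a trained network it would presumably be a learnable parameter.

```python
import math

def slope_map(gray):
    """Gradient magnitude of a 2-D intensity grid, treating brightness as elevation.

    Assumed stand-in for one terrain-style feature: central differences in the
    interior, one-sided differences at the borders.
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            # Divide by the actual sample spacing (1 at borders, 2 inside).
            dx = (gray[y][x1] - gray[y][x0]) / max(x1 - x0, 1)
            dy = (gray[y1][x] - gray[y0][x]) / max(y1 - y0, 1)
            out[y][x] = math.hypot(dx, dy)
    return out

def gated_fuse(primary, auxiliary, gate=0.0):
    """Residual fusion: primary + gate * auxiliary, element-wise.

    With gate = 0.0 (the zero initialization) the output equals the primary
    branch, so the auxiliary branch cannot disturb it at the start of training.
    """
    return [[p + gate * a for p, a in zip(p_row, a_row)]
            for p_row, a_row in zip(primary, auxiliary)]

# Toy 3x3 "image": a horizontal intensity ramp with unit slope.
gray = [[0.0, 1.0, 2.0],
        [0.0, 1.0, 2.0],
        [0.0, 1.0, 2.0]]

slope = slope_map(gray)                          # geometric prior channel
fused_init = gated_fuse(gray, slope, gate=0.0)   # gate closed: identical to gray
fused_open = gated_fuse(gray, slope, gate=0.5)   # gate partly open
```

The point of starting the gate at zero is that the fused output initially reproduces the primary branch exactly, and the auxiliary channel only contributes as the gate moves away from zero.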
Liu et al. (Mon,) studied this question.