What question did this study set out to answer?

The study aims to improve weed–crop detection accuracy in UAV field imagery while keeping a practical accuracy–efficiency trade-off under edge-device budget constraints.

May 15, 2026 · Open Access

D2FNet: A Lightweight Dual-Driven Texture–Semantic Fusion Network for Fine-Grained Real-Time UAV Weed–Crop Detection

Key points

The proposed D2FNet combines a Texture–Semantic Backbone (TSB), the lightweight MCF-A2C2f operator, and cross-scale adaptive fusion via DSSA-Head.
PSBL downweights low-quality positives during training to stabilize box regression.
The model is evaluated on the WeedCrop Image Dataset and the Sesame Crop & Weed Dataset.
On the WeedCrop Image Dataset, D2FNet-n improves mAP50–95 from 36.6% to 44.1% (+7.5 percentage points) over the YOLOv12-n baseline.
On the Sesame Crop & Weed Dataset, mAP50–95 increases from 62.2% to 70.1% (+7.9 percentage points).
The gains are stable under comparable parameter and computation budgets, and the model shows promising cross-dataset robustness on the evaluated benchmarks.
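
As a reading aid for the figures above: mAP50–95 is conventionally the mean of AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05. That COCO-style convention is assumed here, since this summary does not define the metric. The reported gains are absolute differences in percentage points, not relative improvements. A minimal sketch of that arithmetic follows; the helper name `gains` is hypothetical.

```python
# IoU thresholds of the COCO-style mAP50-95 metric (assumed convention):
# 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def gains(baseline: float, improved: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = round(improved - baseline, 1)
    relative = round(100.0 * (improved - baseline) / baseline, 1)
    return absolute, relative


# Values quoted in the key points (mAP50-95, in percent).
weedcrop = gains(36.6, 44.1)  # WeedCrop Image Dataset
sesame = gains(62.2, 70.1)    # Sesame Crop & Weed Dataset

print(weedcrop, sesame)
```

Read this way, the absolute gains of 7.5 and 7.9 points correspond to relative improvements of roughly 20.5% and 12.7% over the respective baselines.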

Abstract

Weed–crop object detection in UAV field imagery faces several significant challenges, including a large proportion of small objects, dense occlusions, similar texture appearance, and strong background interference. These challenges often lead to missed detections, localization drift, and unstable training under edge-device budget constraints. To improve detection accuracy while maintaining a practical accuracy–efficiency trade-off in complex farmland scenes, we propose the Dual-Driven Texture–Semantic Fusion Network (D2FNet), consisting of a Texture–Semantic Backbone (TSB), an efficient operator MCF-A2C2f, a cross-scale adaptive fusion and feature redistribution module DSSA-Head, and a scale-aware reweighting block PSBL. TSB reduces discriminative ambiguity caused by similar weed–crop appearance and complex background textures; MCF-A2C2f controls the additional cost of the dual-driven design via lightweight operator substitution while largely preserving per-scale representations; DSSA-Head addresses multi-scale representation inconsistency induced by abundant small objects and large scale variation in field scenes; PSBL downweights low-quality positives according to sample quality to stabilize box regression and training. Experimental results show that on the WeedCrop Image Dataset, D2FNet-n improves mAP50–95 from 36.6% to 44.1% (+7.5 percentage points) over the baseline YOLOv12-n; on the auxiliary Sesame Crop & Weed Dataset, mAP50–95 increases from 62.2% to 70.1% (+7.9 percentage points). These results indicate that D2FNet achieves stable accuracy gains under comparable parameter and computation budgets, rather than pursuing the smallest absolute model size, and shows promising cross-dataset robustness on the evaluated benchmarks.
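
The abstract does not give PSBL's formulation, so the following is not the authors' method. It is only a generic sketch of the idea of downweighting low-quality positives in a box-regression loss. It assumes that sample quality is approximated by the IoU between the predicted and target box, and that the weight is IoU raised to a power `gamma`. Both choices and all function names are hypothetical.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def quality_weighted_box_loss(preds, targets, gamma=2.0, eps=1e-9):
    """Weighted mean of (1 - IoU) over positive samples.

    Assumed weighting: weight = IoU ** gamma, so poorly aligned positives
    contribute less. With gamma = 0 this reduces to the plain mean.
    In a real training loop the weights would be treated as constants
    (no gradient flowing through them).
    """
    qualities = [iou(p, t) for p, t in zip(preds, targets)]
    weights = [q ** gamma for q in qualities]
    losses = [1.0 - q for q in qualities]
    total = sum(w * l for w, l in zip(weights, losses))
    return total / (sum(weights) + eps)


# One well-aligned and one poorly aligned positive (toy boxes).
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
targets = [(1, 0, 11, 10), (8, 0, 18, 10)]

plain = quality_weighted_box_loss(preds, targets, gamma=0.0)
weighted = quality_weighted_box_loss(preds, targets, gamma=2.0)
print(plain, weighted)
```

In this toy case the weighted loss is smaller than the plain mean because the poorly aligned positive dominates the unweighted average but carries little weight once quality is taken into account.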
