What question did this study set out to answer?

The central aim is to improve UAV object detection by using a multi-modal framework that addresses challenges in diverse environments.

April 22, 2026 · Open Access

DGE-YOLO: Dual-Branch Gathering and Attention for Efficient Accurate UAV Object Detection

Key Points

The study aims to improve UAV object detection in diverse environments through infrared–visible (IR–RGB) multi-modal fusion.
It proposes DGE-YOLO, an enhanced YOLO-based framework for multi-modal UAV object detection.
A dual-branch architecture extracts modality-specific features before fusion, and an Efficient Multi-scale Attention (EMA) module improves feature discrimination across spatial resolutions.
A Gather-and-Distribute module replaces the conventional neck to reduce information loss during feature aggregation.
On the DroneVehicle dataset, DGE-YOLO consistently outperformed state-of-the-art baselines.
The reported improvements cover detection accuracy and efficiency in challenging environments.

Abstract

The rapid proliferation of unmanned aerial vehicles (UAVs) has amplified the need for robust and efficient object detection in diverse aerial environments. However, detecting small objects under complex conditions (e.g., low illumination, cluttered backgrounds, and thermal–visual discrepancies) remains challenging. While many existing detectors emphasize real-time inference, they often rely on weak or late fusion strategies, resulting in suboptimal utilization of complementary multi-modal cues. To address this limitation, we propose DGE-YOLO, an enhanced YOLO-based framework for effective infrared–visible (IR–RGB) multi-modal fusion in UAV object detection. DGE-YOLO adopts a dual-branch architecture for modality-specific feature extraction, preserving modality-aware representations before fusion. To strengthen cross-scale semantics, we introduce an Efficient Multi-scale Attention (EMA) module that improves feature discrimination across spatial resolutions. Furthermore, we replace the conventional neck with a Gather-and-Distribute module to reduce information loss during feature aggregation and improve multi-scale feature propagation. Extensive experiments on the DroneVehicle dataset demonstrate that DGE-YOLO consistently outperforms state-of-the-art baselines, confirming its effectiveness and practicality as an applied multi-modal detection solution for UAV scenarios.
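
The data flow described in the abstract (two modality-specific branches, per-scale fusion, then a gather-and-distribute step across scales) can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D lists stand in for real feature tensors, and every name here (`resize`, `branch`, `fuse`, `gather_and_distribute`, the `gain` parameter, the element-wise mean used for fusion) is a hypothetical placeholder chosen for illustration. The EMA module is omitted entirely.

```python
from typing import List

Feature = List[float]  # toy 1-D "feature map"; real maps are multi-channel tensors


def resize(x: Feature, size: int) -> Feature:
    """Nearest-neighbour resampling of x to `size` samples."""
    n = len(x)
    return [x[i * n // size] for i in range(size)]


def branch(x: Feature, scales: List[int], gain: float) -> List[Feature]:
    """Hypothetical modality-specific extractor: one feature per scale.

    `gain` is a stand-in for learned, modality-specific weights.
    """
    return [[gain * v for v in resize(x, s)] for s in scales]


def fuse(rgb: List[Feature], ir: List[Feature]) -> List[Feature]:
    """Fuse the two branches scale by scale (element-wise mean, an assumption)."""
    return [[(a + b) / 2 for a, b in zip(r, t)] for r, t in zip(rgb, ir)]


def gather_and_distribute(levels: List[Feature]) -> List[Feature]:
    """Gather all scales at one resolution, then inject the result into each scale."""
    target = len(levels[len(levels) // 2])         # use the middle scale as common size
    aligned = [resize(f, target) for f in levels]  # gather: align every scale
    global_feat = [sum(col) / len(col) for col in zip(*aligned)]
    # distribute: resample the global feature to each scale and add it back
    return [
        [v + g for v, g in zip(f, resize(global_feat, len(f)))]
        for f in levels
    ]


scales = [8, 4, 2]
rgb = branch([1.0] * 8, scales, gain=1.0)  # toy visible-light input
ir = branch([3.0] * 8, scales, gain=1.0)   # toy infrared input
out = gather_and_distribute(fuse(rgb, ir))
print([len(f) for f in out])  # [8, 4, 2]
```

The point of the sketch is the ordering: each modality keeps its own representation until the per-scale fusion step, and every output scale then receives information gathered from all scales rather than only from its neighbours.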
