ABSTRACT
Unmanned Aerial Vehicle Object Detection (UAV‐OD) is crucial for applications such as surveillance and environmental monitoring, where balancing high detection precision with real‐time inference speed is a significant challenge. In particular, existing real‐time frameworks struggle to detect small targets because of complex backgrounds and weakened boundary cues. To address these limitations, we propose EAF‐DETR, an enhanced real‐time Detection Transformer built upon the RT‐DETR architecture and designed to achieve a better trade‐off between accuracy and efficiency. Specifically, we design a PolyEdge Integrator (PEI) to strengthen boundary‐aware representations. Within PEI, an Edge Feature Refinement Block (EFRB) explicitly extracts and enhances high‐frequency edge cues, while a Peak Squeeze and Excitation (PeakSE) module uses a saliency‐based max‐pooling mechanism to adaptively focus on object boundaries. Furthermore, we introduce an Adaptive Weighted Feature Fusion (AWFF) module. By integrating wavelet‐based feature decomposition, AWFF performs efficient cross‐scale fusion that emphasizes informative regions and preserves fine details without introducing excessive computational overhead. Extensive experiments on the VisDrone‐2019 and UAVVaste datasets demonstrate the effectiveness of EAF‐DETR. On VisDrone‐2019, EAF‐DETR‐R18 achieves an Average Precision (AP) of 30.6, surpassing RT‐DETR‐R18 by 3.9 points. On UAVVaste, it reaches 45.5 AP, a gain of 9.2 points over the baseline's 36.3 AP. These results verify that EAF‐DETR achieves a state‐of‐the‐art balance between accuracy and speed, significantly improving small‐object detection robustness while retaining real‐time inference capability.
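To make the PeakSE idea concrete, the following is a minimal sketch of a squeeze‐and‐excitation gate whose squeeze step uses per‐channel max pooling rather than average pooling. It is an illustration under stated assumptions, not the module as implemented in EAF‐DETR: the function name `peak_se`, the single‐bottleneck layout, and the toy weights `w1` and `w2` are all hypothetical, and the saliency‐based details of the actual design are not specified in this abstract.

```python
import math

def peak_se(x, w1, w2):
    """Channel gating with a max-pooled squeeze (hypothetical sketch).

    x  : feature map as a list of C channels, each a list of H rows of W floats
    w1 : bottleneck weights, shape (C_hidden, C)   -- assumed, not from the paper
    w2 : expansion weights,  shape (C, C_hidden)   -- assumed, not from the paper
    """
    # Squeeze: keep each channel's peak response instead of its mean.
    squeezed = [max(max(row) for row in channel) for channel in x]

    # Excitation, step 1: linear bottleneck followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]

    # Excitation, step 2: linear expansion followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Scale: multiply every value in a channel by that channel's gate.
    out = [[[v * g for v in row] for row in channel]
           for channel, g in zip(x, gates)]
    return out, gates

# Toy input: 2 channels of size 2x2, with a 1-unit bottleneck.
x = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[-1.0, 0.5], [0.0, 0.25]],
]
w1 = [[0.5, -0.5]]
w2 = [[1.0], [-1.0]]
out, gates = peak_se(x, w1, w2)
```

In this toy setup the first channel, which has the stronger peak, receives the larger gate. A max‐based squeeze responds to the strongest activation in a channel, which is one plausible reason to prefer it when the signal of interest is sparse, as edge responses on small objects tend to be.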
Shi et al. (Sun,) studied this question.