Object detection in UAV scenarios is frequently compromised by drastic scale variation and pervasive background clutter. We propose Aero-DETR, which comprises a Multi-Scale Perception Stem (MSPS) for early feature adaptation, Global-Spatial Synergistic Attention (GSSA) to suppress noise, and Partitioned Spatial-Adaptive Fusion (PSAF) to mitigate information decay. Together, these modules enhance foreground saliency and restore geometric details lost during pyramid aggregation. Experiments on the VisDrone and SIMD datasets show that Aero-DETR achieves superior detection accuracy at the cost of more parameters, more GFLOPs, and higher inference latency, indicating an explicit accuracy-efficiency trade-off.
Zhang et al. (Thu,) studied this question.