The precise localization of small objects in UAV-captured remote sensing imagery remains a formidable challenge due to their limited spatial support, coarse resolution, and severe background clutter. These factors often cause weak target cues to be progressively overwhelmed during deep feature extraction. Existing deep learning-based detectors typically suffer from two fundamental limitations: the irreversible loss of fine-grained spatial details during hierarchical feature fusion and the scale-insensitive optimization of conventional loss functions, which inadequately emphasize hard-to-detect small targets. To address these issues, we propose a novel Spatial-Semantic Aggregation and Balancing Network (SSABNet) tailored for UAV-based small-target detection. First, a Spatial-Semantic Aggregation (SSA) module is introduced to establish a high-fidelity restoration pathway that recovers fine-grained texture and boundary information from shallow layers. By employing content-aware operators, SSA effectively reconciles the structural discrepancy between spatial details and semantic abstractions, enabling precise cross-scale feature fusion while suppressing aliasing artifacts. Second, we design a Scale-Aware Balancing Loss (SABL) to mitigate the gradient instability and vanishing-gradient issues commonly encountered when optimizing non-overlapping small targets. SABL adopts a scale-dependent modulation mechanism that smoothly transitions from Wasserstein distance for distributional alignment of small objects to Euclidean distance for geometric refinement of larger targets, thereby ensuring stable and balanced optimization across object scales. Extensive experiments on the VisDrone benchmark demonstrate that SSABNet outperforms state-of-the-art detectors, achieving gains of 1.3% in overall AP and 2.5% in APs. Further evaluation on the UAVDT dataset confirms its strong generalization capability, yielding improvements of 0.5% in AP and 16.9% in APs. 
These results validate the effectiveness of jointly addressing feature representation and scale-aware optimization for UAV small-target detection.
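The scale-dependent blending described for SABL can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract gives no equations, so the sigmoid gate, the `pivot`, `temperature`, and `c` constants, and all function names below are hypothetical choices made for illustration. The only standard ingredient is the closed-form 2-Wasserstein distance between axis-aligned Gaussians fitted to boxes given as (cx, cy, w, h).

```python
import numpy as np


def wasserstein2_sq(box_a, box_b):
    """Squared 2-Wasserstein distance between Gaussians fitted to two boxes.

    Each box (cx, cy, w, h) is modelled as N((cx, cy), diag(w^2/4, h^2/4)).
    For such axis-aligned Gaussians the distance has the closed form
    ||(cx, cy, w/2, h/2)_a - (cx, cy, w/2, h/2)_b||^2.
    """
    a = np.asarray(box_a, dtype=float)
    b = np.asarray(box_b, dtype=float)
    scale = np.array([1.0, 1.0, 0.5, 0.5])
    return float(np.sum((a * scale - b * scale) ** 2))


def scale_weight(target_box, pivot=32.0, temperature=8.0):
    """Hypothetical sigmoid gate in [0, 1]: near 0 for small targets, near 1 for large.

    `pivot` and `temperature` (in pixels) are assumed values, not from the paper.
    """
    _, _, w, h = target_box
    size = np.sqrt(w * h)
    return float(1.0 / (1.0 + np.exp(-(size - pivot) / temperature)))


def blended_box_loss(pred_box, target_box, pivot=32.0, temperature=8.0, c=12.8):
    """Blend a Wasserstein term (small targets) with a Euclidean term (large targets)."""
    alpha = scale_weight(target_box, pivot, temperature)

    # Wasserstein term, squashed into [0, 1); `c` is an assumed normalising constant.
    w_term = 1.0 - np.exp(-np.sqrt(wasserstein2_sq(pred_box, target_box)) / c)

    # Euclidean term on raw box parameters, normalised by the target diagonal.
    diff = np.asarray(pred_box, dtype=float) - np.asarray(target_box, dtype=float)
    diag = np.hypot(target_box[2], target_box[3])
    e_term = np.linalg.norm(diff) / diag

    return float((1.0 - alpha) * w_term + alpha * e_term)


if __name__ == "__main__":
    target = (10.0, 10.0, 4.0, 4.0)
    # Neither prediction overlaps the target, so an IoU-based loss would not
    # distinguish them; the blended loss still grows with the offset.
    print(blended_box_loss((20.0, 10.0, 4.0, 4.0), target))
    print(blended_box_loss((30.0, 10.0, 4.0, 4.0), target))
```

Under these assumptions, a 4×4 target is governed almost entirely by the Wasserstein term, which keeps a distance-dependent signal even when the predicted and target boxes do not overlap, while a 128×128 target is governed almost entirely by the Euclidean term.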
Zhang et al. (Mon,) studied this question.