Multi-platform aerial remote sensing supports critical applications including wide-area surveillance, traffic monitoring, maritime security, and search and rescue. However, constrained by observation altitude and sensor resolution, targets inherently exhibit small-scale characteristics, making small object detection a fundamental bottleneck. Aerial remote sensing faces three unique challenges: (1) spatial heterogeneity of modality reliability due to scene diversity and illumination dynamics; (2) conflict between precise localization requirements and progressive spatial information degradation; (3) annotation ambiguity from imaging physics conflicting with IoU-based training. This paper proposes RAPT-Net with three core modules: MRAAF achieves scene-adaptive modality integration through two-stage progressive fusion; CMFE-SRP employs hierarchy-specific processing to balance spatial details and semantic enhancement; DS-STD increases positive sample coverage to 4× through spatial tolerance expansion. Experiments on VEDAI (satellite) and RGBT-Tiny (UAV) demonstrate mAP values of 62.22% and 18.52%, improving over the state of the art by 4.3% and 10.3%, with a 17.3% improvement on extremely tiny targets.
Peida Zhou
Xiaojun Guo
Xiaoyong Sun
Zhou et al. studied this question.
www.synapsesocial.com/papers/69810013c1c9540dea81326d — DOI: https://doi.org/10.3390/rs18030449