In oriented object detection for remote sensing images, weakly supervised methods based on horizontal bounding boxes have attracted considerable attention because their annotation cost is lower than that of fully supervised methods. However, remote sensing images have complex backgrounds, a wide range of target scales, and geometric characteristics that vary across target categories. Existing methods do not adequately exploit background and angular information under weak supervision, which weakens their perception of dense and high-aspect-ratio targets. They also neglect the imbalance among angle estimation samples, which leads to very low detection accuracy for categories with few samples. To address these issues, this paper proposes a Geometry-Aware Enhancement Network (WSOOD-GAEN) for weakly supervised oriented object detection. First, in the backbone stage, a channel-spatial deformable attention module (DAE-ResNet) is constructed. By deformably sampling and screening key regions, it gives feature extraction both morphological adaptability to complex shapes and semantic discriminability for key features in complex backgrounds. Second, in the feature pyramid stage, an Angle-Guided Feature Pyramid Network (AG-FPN) is proposed. This module dynamically applies a rotation transformation to the sampling offsets of deformable convolutions, thereby enhancing the feature representation of objects with different orientations and scales. Third, an adaptive geometric perception loss (AGL) is designed. Based on the geometric characteristics of each category, it automatically learns differentiated rotation and flip consistency weights, thereby improving prediction accuracy for categories with few samples. Experiments on the DOTA-v1.0, HRSC, and RSAR datasets validate our approach.
Specifically, under the AP75 evaluation metric, the proposed method outperforms existing weakly supervised methods by 1.51%, 9.86%, and 3.28% on the three datasets, respectively.
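
The idea of rotating the sampling offsets of a deformable convolution can be illustrated with a minimal geometric sketch. This is not the AG-FPN implementation, whose details are not given here; it only shows what applying a 2D rotation to a regular 3x3 sampling grid means. The function names `base_grid` and `rotate_offsets` and the (dx, dy) offset convention are assumptions made for illustration.

```python
import math

def base_grid(k=3):
    """Return the regular k x k sampling offsets as (dx, dy) pairs, row by row."""
    half = k // 2
    return [(float(dx), float(dy))
            for dy in range(-half, half + 1)
            for dx in range(-half, half + 1)]

def rotate_offsets(offsets, theta):
    """Rotate each (dx, dy) offset by theta radians about the kernel center.

    Hypothetical helper: applies the standard 2D rotation matrix
    [[cos, -sin], [sin, cos]] to every offset.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [(c * dx - s * dy, s * dx + c * dy) for dx, dy in offsets]

# Example: rotate the 3x3 grid by 90 degrees.
grid = base_grid(3)
rotated = rotate_offsets(grid, math.pi / 2)
```

In this sketch the center offset stays at the origin and every other sampling point keeps its distance from the center, so the kernel footprint is reoriented without being rescaled.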