Ship detection in Synthetic Aperture Radar (SAR) imagery plays a critical role in maritime surveillance, supporting security and defense as well as the management of territorial waters. The task remains challenging, however, because of the complex characteristics of SAR data, including strong background noise, side-lobe effects, and ambiguous target signals. In this context, deep learning methods, particularly real-time object detection architectures such as You Only Look Once (YOLO), have shown considerable potential. Nevertheless, their effectiveness on SAR imagery is still limited by suboptimal feature extraction and degraded performance in high-noise environments. This paper proposes SSD-YOLOv12, an improved version of the YOLOv12n architecture that integrates an M-MBConvBlock module (an enhanced variant of the MBConvBlock from EfficientNet) into the backbone to enhance representational capacity and adapt the network to SAR images. Additionally, the loss function is refined by replacing the Complete Intersection over Union (CIoU) loss with I-ShapeIoU (an improved Shape Intersection over Union) to improve localization accuracy. Experiments show that the proposed architecture achieves 90.1% mAP@0.5 while remaining computationally efficient. SSD-YOLOv12 does so with only 1.16 million parameters and a compact 2.8 MB memory footprint, substantially smaller than contemporary YOLO variants such as YOLOv8n (3.01M parameters, 6.3 MB) and the YOLOv12n baseline (2.56M parameters, 5.5 MB). This combination of high accuracy and model compactness supports its suitability for real-time ship detection in SAR imagery, particularly on resource-constrained platforms.
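To make the reported size comparison concrete, the relative reductions can be worked out directly from the figures quoted above. The following is a minimal sketch of that arithmetic only; the dictionary layout and the helper name `relative_reduction` are illustrative assumptions and are not part of the proposed method.

```python
# Figures quoted in the abstract: (parameters in millions, size in MB).
# The dictionary layout is an illustrative assumption.
MODELS = {
    "SSD-YOLOv12": (1.16, 2.8),
    "YOLOv12n": (2.56, 5.5),
    "YOLOv8n": (3.01, 6.3),
}


def relative_reduction(proposed: float, reference: float) -> float:
    """Return the percentage by which `proposed` is smaller than `reference`."""
    return 100.0 * (1.0 - proposed / reference)


if __name__ == "__main__":
    proposed_params, proposed_size = MODELS["SSD-YOLOv12"]
    for name in ("YOLOv12n", "YOLOv8n"):
        ref_params, ref_size = MODELS[name]
        param_cut = relative_reduction(proposed_params, ref_params)
        size_cut = relative_reduction(proposed_size, ref_size)
        print(f"vs {name}: {param_cut:.1f}% fewer parameters, "
              f"{size_cut:.1f}% smaller footprint")
```

Under these figures, the proposed model has roughly 55% fewer parameters and a 49% smaller footprint than the YOLOv12n baseline, and roughly 61% fewer parameters and a 56% smaller footprint than YOLOv8n.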
Nguyen et al. (Sun,) studied this question.