The performance of deep learning approaches for Synthetic Aperture Radar (SAR) target detection is often limited by the scarcity of annotated data. Although Self-Supervised Learning (SSL) has emerged as a powerful paradigm for reducing dependence on annotated data, its potential for SAR target detection remains underexplored. In this study, we propose SARDet-MIM, a two-stage framework based on Masked Image Modeling (MIM) for SAR target detection. In the self-supervised pre-training stage, we introduce a Structural and Scattering Masked Autoencoder (SSMAE) for SAR imagery. Unlike conventional MIM methods, which typically reconstruct raw pixels, SSMAE uses a physics-aware reconstruction target composed of multi-scale gradient and SAR-Harris features. This target explicitly guides the network to capture the discriminative structural context and intrinsic scattering characteristics that benefit SAR target detection. In the downstream detection stage, we construct a Maximally Pre-trained Detector (MPD), which transfers the entire pre-trained ViT encoder–decoder architecture to the detection network so that the pre-trained representations are fully exploited. Extensive experiments on three SAR target detection datasets demonstrate that SARDet-MIM consistently outperforms competing methods.
Zhou et al. studied this question.
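To make the idea of a feature-based reconstruction target concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single-channel image, a simple box-smoothed finite-difference gradient as a stand-in for the multi-scale gradient features, random patch masking, and a mean squared error computed over masked patches only (a common MAE-style choice that the abstract does not confirm). The SAR-Harris component is omitted, and all function names, scales, patch sizes, and the mask ratio are hypothetical.

```python
import numpy as np


def multiscale_gradient_target(img, scales=(1, 2, 4)):
    """Stack gradient-magnitude maps computed at several smoothing scales.

    Hypothetical stand-in for multi-scale gradient features: each scale
    box-smooths the image with an s x s window before differentiating.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    maps = []
    for s in scales:
        # Box filter: average s*s shifted copies of an edge-padded image.
        padded = np.pad(img, ((0, s - 1), (0, s - 1)), mode="edge")
        smooth = np.zeros((h, w), dtype=np.float64)
        for dy in range(s):
            for dx in range(s):
                smooth += padded[dy:dy + h, dx:dx + w]
        smooth /= s * s
        gy, gx = np.gradient(smooth)  # derivatives along rows, then columns
        maps.append(np.hypot(gx, gy))
    return np.stack(maps, axis=0)  # shape: (num_scales, H, W)


def random_patch_mask(h, w, patch, ratio, rng):
    """Boolean (H, W) mask; True marks pixels that belong to masked patches.

    Assumes h and w are multiples of the patch size.
    """
    gh, gw = h // patch, w // patch
    n = gh * gw
    flat = np.zeros(n, dtype=bool)
    flat[rng.choice(n, size=int(round(ratio * n)), replace=False)] = True
    grid = flat.reshape(gh, gw)
    return np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)


def masked_feature_loss(pred, target, mask):
    """Mean squared error restricted to masked pixels (assumed loss form)."""
    sq_err = (pred - target) ** 2  # shape: (C, H, W)
    return sq_err[:, mask].mean()


rng = np.random.default_rng(0)
image = rng.random((32, 32))          # placeholder for a SAR amplitude chip
target = multiscale_gradient_target(image)
mask = random_patch_mask(32, 32, patch=8, ratio=0.75, rng=rng)
prediction = np.zeros_like(target)    # placeholder for a decoder output
print("masked feature loss:", masked_feature_loss(prediction, target, mask))
```

In an actual pipeline the `prediction` array would come from a decoder fed only the visible patches; the zero array here merely exercises the loss.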