High-resolution remote sensing image change detection has significant application value in urban planning, disaster assessment, and related fields. However, it faces two challenges: pseudo-change interference and the loss of detailed information. To address these issues, this paper proposes a frequency-domain-aware Siamese detail recovery network (FSD-Net). First, from the perspective of frequency-domain analysis, a theory of the dual roles of frequency-domain components is introduced, which reveals the robustness of low-frequency components to pseudo-changes and the dual semantic-noise attributes of high-frequency components. Based on this theory, a frequency-aware context-guided difference (FCGD) module is designed. It explicitly decouples the difference features into low-frequency global components and high-frequency residual components, and uses the low-frequency scene prior as a semantic gate to adaptively modulate the high-frequency differences, which effectively suppresses pseudo-change interference. Subsequently, a detail recovery block (DRB) based on sub-pixel convolution is constructed. It achieves unbiased spatial rearrangement by exploiting the semantic redundancy of the channel dimension, which avoids the checkerboard artifacts of traditional upsampling, and it employs a progressive multi-stage upsampling strategy to integrate shallow detail features from the encoder. Experimental results on three public datasets, LEVIR-CD, WHU-CD, and CDD-CD, demonstrate that FSD-Net outperforms current mainstream methods (e.g., ChangeFormer and BAN) on core metrics such as F1 score and IoU, with a particularly significant improvement in recall. Ablation experiments validate the effectiveness and complementarity of the FCGD module and the DRB. Parameter sensitivity analysis indicates that the optimal auxiliary loss weight λ is dataset-dependent, with λ = 0.1 serving as a robust default choice.
This study provides an efficient and reliable solution for change detection in high-resolution remote sensing imagery.
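The two mechanisms named in the abstract, frequency decoupling with a low-frequency gate and sub-pixel rearrangement, can be illustrated with a minimal NumPy sketch. This is not the FSD-Net implementation: the ideal FFT low-pass filter, the `cutoff` value, the sigmoid gate, and the function names `split_frequency`, `gated_difference`, and `pixel_shuffle` are all assumptions chosen for illustration, and the learned convolutions of the actual FCGD module and DRB are omitted.

```python
import numpy as np


def split_frequency(x, cutoff=0.1):
    """Split (C, H, W) features into low- and high-frequency parts.

    Assumption: an ideal circular low-pass in the FFT domain stands in for
    whatever decomposition the paper actually uses. `cutoff` is in cycles
    per pixel (the Nyquist limit is 0.5).
    """
    h, w = x.shape[-2:]
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies
    mask = np.hypot(fy, fx) <= cutoff
    spectrum = np.fft.fft2(x, axes=(-2, -1))
    low = np.fft.ifft2(spectrum * mask, axes=(-2, -1)).real
    return low, x - low  # the high-frequency part is the residual


def gated_difference(diff, cutoff=0.1):
    """Modulate the high-frequency residual with a gate from the low band.

    Assumption: a plain sigmoid of the low-frequency component acts as the
    gate; the paper's gate is presumably learned.
    """
    low, high = split_frequency(diff, cutoff)
    gate = 1.0 / (1.0 + np.exp(-low))
    return low + gate * high


def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r) without interpolation."""
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    # Channel index c*r*r + i*r + j maps to output pixel (h*r + i, w*r + j).
    x = x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2)
    return x.reshape(c, h * r, w * r)
```

Because `pixel_shuffle` only moves channel values into spatial positions, each output pixel comes from exactly one input value. This is the property that lets sub-pixel upsampling avoid the uneven kernel overlap behind checkerboard artifacts in strided transposed convolution.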
Li et al. (Sun,) studied this question.