What question did this study set out to answer?

April 1, 2026 · Open Access

Unsupervised Change Detection in Heterogeneous Remote Sensing Images via Dynamic Mask Guidance

Key Points

Aim: improve unsupervised change detection in heterogeneous remote sensing images by addressing sensor-specific discrepancies.
Proposes MaskUCD, an unsupervised framework for heterogeneous change detection.
Reformulates the task as a dynamic mask-driven constraint scheduling problem.
Uses a spatially adaptive optimization mechanism: features are aligned in mask-unchanged regions and pushed apart in mask-changed regions.
Builds on an asymmetric autoencoder with a hybrid encoder that combines multi-scale frequency analysis and global context modeling.
Reports state-of-the-art performance in the authors' experiments.
Reports superior robustness compared to existing advanced methods.
Produces refined, semantically consistent masks that give increasingly precise structural guidance and yield discriminative difference maps.

Abstract

Unsupervised change detection (CD) in heterogeneous remote sensing images is intrinsically difficult due to severe sensor-specific discrepancies. In the absence of ground truth, these discrepancies result in ambiguous optimization objectives that make it difficult for models to distinguish true land-cover changes from modality-driven pseudo-changes. To address these challenges, we propose MaskUCD, a novel unsupervised framework that reformulates heterogeneous CD as a dynamic mask-driven constraint scheduling problem. Fundamentally distinct from conventional strategies that enforce selective feature alignment, MaskUCD employs a spatially adaptive optimization mechanism. Specifically, the iteratively refined mask serves as a geometric reference to guide optimization. It enforces strict feature alignment in mask-unchanged regions to suppress modality-induced discrepancies, while simultaneously promoting feature divergence in mask-changed regions to emphasize semantic inconsistencies. In this way, explicit optimization objectives are established, together with an intrinsic interpretability constraint that guides the CD process. This strategy treats the mask as a structural guide for representation learning rather than a ground-truth reference, thereby avoiding error accumulation caused by directly using inaccurate masks as supervisory signals. To facilitate this optimization, we design a specialized asymmetric autoencoder with a hybrid encoder architecture, utilizing multi-scale frequency analysis and global context modeling to enhance feature representation capabilities. Consequently, this design enables the generation of refined and semantically consistent masks, which provide increasingly precise structural guidance, yielding converged and discriminative difference maps. Extensive experiments demonstrate that MaskUCD achieves state-of-the-art performance and superior robustness compared to existing advanced methods.
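The mask-guided objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the contrastive-style loss form, the `margin` parameter, the mean-threshold mask update, and all function names below are assumptions chosen only to show the idea of aligning features where the mask says "unchanged" and separating them where it says "changed".

```python
import math


def pixel_distances(feat_x, feat_y):
    """Euclidean distance between the two feature vectors at each pixel."""
    return [math.dist(fx, fy) for fx, fy in zip(feat_x, feat_y)]


def mask_guided_loss(feat_x, feat_y, mask, margin=1.0):
    """Contrastive-style loss (an assumed form, not the paper's exact objective).

    mask[i] == 0: pixel treated as unchanged, so the features are pulled together.
    mask[i] == 1: pixel treated as changed, so the features are pushed
                  at least `margin` apart.
    """
    dists = pixel_distances(feat_x, feat_y)
    total = 0.0
    for d, m in zip(dists, mask):
        align = (1.0 - m) * d ** 2               # alignment term (unchanged)
        diverge = m * max(0.0, margin - d) ** 2  # divergence term (changed)
        total += align + diverge
    return total / len(dists)


def refine_mask(feat_x, feat_y):
    """Hypothetical mask update: threshold the distance map at its mean."""
    dists = pixel_distances(feat_x, feat_y)
    threshold = sum(dists) / len(dists)
    return [1 if d > threshold else 0 for d in dists]


# Toy features for three pixels from two (hypothetical) encoders.
feat_x = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
feat_y = [[0.0, 0.1], [1.0, 1.0], [2.0, 2.0]]

mask = refine_mask(feat_x, feat_y)  # -> [0, 0, 1]
loss_consistent = mask_guided_loss(feat_x, feat_y, mask)
loss_inverted = mask_guided_loss(feat_x, feat_y, [1 - m for m in mask])
```

In this toy case the loss is much lower for the mask that agrees with the feature distances than for its inverse, which is the sense in which the mask acts as a structural guide rather than a ground-truth label. In a real pipeline the two steps would alternate: the encoders are updated under the current mask, then the mask is refined from the new difference map.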

Cite This Study

Xie et al. (Sun,) studied this question.

synapsesocial.com/papers/69ccb72e16edfba7beb89077 https://doi.org/10.3390/rs18071022
