Semantic change detection (SCD) aims to identify land cover changes between bi-temporal remote sensing images and plays a crucial role in applications such as urban monitoring and disaster assessment. Most existing methods rely on a shared-weight encoder with a dedicated change extractor, which limits their ability to precisely localize changed regions and maintain intra-class semantic consistency. Although vision foundation models (VFMs) exhibit strong generalization ability, their potential for SCD remains largely underexplored. In this paper, we introduce ChangeVFM, a novel and effective framework that unleashes the power of VFMs for semantic change detection in remote sensing images. The framework is supplemented by a spatiotemporal modeling module that captures fine-grained spatial details and a query-based feature injector that integrates the VFM’s semantic priors with multi-scale spatiotemporal features. The feature injector enables ChangeVFM to both maintain semantic consistency and preserve the multi-scale information of the bi-temporal images. Without bells and whistles, ChangeVFM achieves competitive performance on the HRSCD, SECOND, and Landsat-SCD datasets. Comprehensive quantitative and qualitative experiments further validate the effectiveness of the introduced modules and the robustness of the proposed method.
Huang et al. (Wed,) studied this question.