What question did this study set out to answer?

The study asks how a pre-trained RGB diffusion model can be adapted, at low parameter cost, to generate multispectral (RGB+NIR) remote sensing images, as a way to ease data scarcity and high annotation costs.

March 18, 2026

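Parameter-Efficient Adaptation of Diffusion Models and Spectral Consistency Learning for Controllable Multispectral Remote Sensing Image Generation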

Key Points

Goal: adapt a pre-trained diffusion model to multispectral remote sensing image generation through parameter-efficient fine-tuning, easing data scarcity and annotation costs.
Embeds several low-parameter fine-tuning modules in a frozen pre-trained diffusion model, transferring it from the RGB domain to the four-band (RGB+NIR) domain.
Introduces physical constraints during adaptation training: a consistency loss based on NDVI and NDWI that enforces the spectral correlation between the red and near-infrared bands.
Designs a text-aware spatial semantic encoding mechanism that uses semantic segmentation masks to control the spatial layout of land cover.
Validates the method on datasets including FLAIR, Five-Billion-Pixels, and IRSAMap.
Reports improved spectral fidelity and semantic alignment compared with mainstream methods such as ControlNet and T2I-Adapter.
Generates near-infrared bands that carry clear physical meaning.
Reports some accuracy improvement on a downstream open-vocabulary segmentation task when generated data is used to assist training.
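
The summary does not say which low-parameter modules the authors embed in the frozen model. As one common example of the general idea, the sketch below shows a LoRA-style low-rank adapter on a single frozen weight matrix. The class name `LoRALinear`, the rank, the scaling factor, and the layer sizes are all illustrative assumptions, not details from the paper.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (LoRA-style).

    Illustrative assumption only: the paper's actual adapter modules are
    not specified in this summary.
    """

    def __init__(self, weight, rank=4, alpha=4.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                   # frozen, never updated
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))    # trainable
        self.B = np.zeros((out_dim, rank))                     # trainable, zero-initialised
        self.scale = alpha / rank

    def __call__(self, x):
        # Frozen path plus scaled low-rank correction: x W^T + s * x A^T B^T
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(64, 32))        # stands in for a pre-trained weight
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(5, 32))
print(layer(x).shape, layer.trainable_params(), W.size)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only the two small matrices would be updated during fine-tuning.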

Abstract

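Deep learning for multispectral remote sensing is held back by data that is hard to acquire and costly to annotate, while existing generative foundation models cannot be applied directly to multispectral data and training one from scratch is computationally expensive. To address this, the paper proposes a parameter-efficient adapted diffusion model for multispectral remote sensing image generation. The method uses a parameter-efficient fine-tuning strategy, embedding several low-parameter fine-tuning modules in a frozen pre-trained diffusion model. Unlike general-purpose controllable generation methods, which model images in a purely data-driven way, it explicitly introduces physical constraints from remote sensing spectra during adaptation training and designs a text-aware encoding mechanism for the mapping between land-cover semantics and spatial layout. This enables low-cost transfer from the RGB image domain to the four-band (RGB+NIR) image domain, with the different fine-tuning modules jointly handling spectral and spatial-texture adaptation. On this basis, a physical consistency loss based on the Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Water Index (NDWI) is introduced to enforce the spectral correlation between the red and near-infrared bands. In addition, a text-aware spatial semantic encoding mechanism is proposed that uses semantic segmentation masks to control the spatial layout of land cover precisely. Experiments on FLAIR, Five-Billion-Pixels, IRSAMap, and other datasets show that, compared with mainstream methods such as ControlNet and T2I-Adapter, the proposed method improves both spectral fidelity and semantic alignment, and the generated near-infrared band carries clear physical meaning. Using the generated data to assist training also yields some accuracy improvement on a downstream open-vocabulary segmentation task, which supports the method's feasibility as a data augmentation technique. The framework effectively addresses the channel mismatch and loss of physical characteristics that arise when RGB foundation models are transferred to the multispectral remote sensing domain, achieves high-quality, controllable multispectral data generation at low resource cost, and offers an effective data augmentation approach for alleviating data scarcity in remote sensing interpretation tasks.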
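
The abstract names NDVI and NDWI as the basis of the physical consistency loss but does not give its exact form. The sketch below uses the standard index definitions, NDVI = (NIR - Red) / (NIR + Red) and NDWI = (Green - NIR) / (Green + NIR), and assumes a simple mean absolute difference between the indices of a generated image and a reference image. The R, G, B, NIR channel order, the weights, and the function names are illustrative assumptions rather than the authors' formulation.

```python
import numpy as np

EPS = 1e-6  # guards against division by zero on very dark pixels


def ndvi(img):
    """NDVI = (NIR - Red) / (NIR + Red).

    img: array of shape (4, H, W), assumed channel order R, G, B, NIR.
    """
    red, nir = img[0], img[3]
    return (nir - red) / (nir + red + EPS)


def ndwi(img):
    """NDWI = (Green - NIR) / (Green + NIR), same assumed channel order."""
    green, nir = img[1], img[3]
    return (green - nir) / (green + nir + EPS)


def spectral_consistency_loss(generated, reference, w_ndvi=1.0, w_ndwi=1.0):
    """Mean absolute index mismatch (assumed form, not the paper's)."""
    l_ndvi = np.mean(np.abs(ndvi(generated) - ndvi(reference)))
    l_ndwi = np.mean(np.abs(ndwi(generated) - ndwi(reference)))
    return w_ndvi * l_ndvi + w_ndwi * l_ndwi


rng = np.random.default_rng(0)
reference = rng.uniform(0.05, 1.0, size=(4, 8, 8))  # synthetic reflectances
generated = reference.copy()
generated[3] *= 0.5                                 # corrupt only the NIR band
print(spectral_consistency_loss(reference, reference))
print(spectral_consistency_loss(generated, reference))
```

A loss of this kind penalises a generated NIR band that looks plausible on its own but breaks the expected relationship with the red and green bands, which is the failure a purely data-driven objective may not catch.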
