Hyperspectral imagery (HSI) is often degraded by various types of noise during acquisition, which can significantly impair subsequent applications. Deep learning-based methods, particularly transformer-based and diffusion model-based approaches, have emerged as effective techniques for HSI denoising. However, transformer-based methods typically denoise in a single step, which is insufficient to preserve the fine-grained details of the original HSI throughout the computation. Diffusion model-based methods, on the other hand, are limited by network structures that focus on local information: they do not model the spatial non-local similarity or the spectral information in HSI, which results in significant spectral distortion. To address these challenges, we propose a novel spatial-spectral transformer-based diffusion model (S2TDM) for HSI denoising that combines the advantages of transformers and diffusion models. S2TDM decomposes the denoising process into multiple iterative time steps, each guided by the original HSI, allowing the denoising network to progressively refine an image initially corrupted by Gaussian noise. To better capture the spatial and spectral features of HSI, we propose a spatial-spectral transformer-based denoising network that applies transformers along the spatial dimension, the spectral dimension, and an additional time step dimension, so as to capture the spatial similarity and spectral correlation of HSI more comprehensively. Experimental results on synthetic and real HSI datasets demonstrate that S2TDM outperforms state-of-the-art HSI denoising methods across multiple evaluation metrics. The code will be available at https://github.com/zhehui-wu/S2TDM.
Zhe-hui et al. (Wed,) studied this question.
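The iterative, guided refinement described in the abstract can be illustrated with a minimal sketch of a conditional reverse-diffusion loop. This is not the S2TDM implementation. It assumes a standard DDPM-style noise-prediction formulation with a linear noise schedule, neither of which the abstract specifies, and it assumes an HSI array laid out as (bands, height, width). The names `make_schedule`, `placeholder_noise_predictor`, and `iterative_denoise` are hypothetical, and the placeholder predictor merely stands in for the spatial-spectral transformer network.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear noise schedule (an assumption; the abstract does not specify one)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of alphas up to each step
    return betas, alphas, alpha_bars


def placeholder_noise_predictor(x_t, condition, t):
    """Hypothetical stand-in for the denoising network.

    A real model would be a learned network taking the current estimate,
    the guiding HSI, and the time step. This placeholder only keeps the
    loop runnable and has no denoising ability of its own.
    """
    return x_t - condition


def iterative_denoise(condition, noise_predictor, num_steps=50, seed=0):
    """DDPM-style ancestral sampling, conditioned on a guiding HSI at every step."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Start from pure Gaussian noise with the same shape as the guiding HSI.
    x = rng.standard_normal(condition.shape)

    for t in range(num_steps - 1, -1, -1):
        # The guiding HSI is passed to the network at each time step.
        eps_hat = noise_predictor(x, condition, t)

        # Posterior mean of the previous step under the noise-prediction form.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])

        if t > 0:
            # Add fresh noise on every step except the last one.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x


if __name__ == "__main__":
    # Toy guiding cube: 4 spectral bands, 8 x 8 pixels (assumed layout).
    guide = np.zeros((4, 8, 8))
    restored = iterative_denoise(guide, placeholder_noise_predictor, num_steps=10)
    print(restored.shape)
```

In this sketch the guiding cube enters only through the `condition` argument of the predictor, which is one common way to condition a diffusion model. How S2TDM actually injects the guidance and the time step into its transformer blocks is not stated in the abstract.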