What question did this study set out to answer?

The aim is to improve blind image restoration using a sparse transformer that effectively handles degradation information.

June 14, 2026

WSformer: Wavelet-based Sparse Transformer for Blind Image Restoration

Key Points

Aim: improve blind image restoration (BIR) with a sparse transformer that handles degradation information effectively.
WSformer introduces Sparse Reciprocal Multi-head Self-Attention (SR-MSA), which adaptively selects reliable information and operates across channels to reduce computational complexity.
A Recalibrated Feed-Forward Network (RFFN) fuses local and global information to make feature learning more robust.
A wavelet transform, combined with a U-shaped architecture, offsets the added computational cost and balances performance against inference time.
Experiments on multiple BIR tasks support WSformer's effectiveness in both quantitative metrics and visual quality.
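
To illustrate why a wavelet transform can cut computation without discarding information, here is a minimal NumPy sketch of a single-level 2D Haar transform. The choice of the Haar wavelet and the function names haar_dwt2 and haar_idwt2 are assumptions made for illustration; this summary does not say which wavelet WSformer uses or how it is wired into the network.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform of an array shaped (C, H, W), H and W even.

    Returns four sub-bands, each shaped (C, H/2, W/2).
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # low-pass approximation
    d_w = (a - b + c - d) / 2.0  # differences along the width
    d_h = (a + b - c - d) / 2.0  # differences along the height
    d_d = (a - b - c + d) / 2.0  # diagonal differences
    return ll, d_w, d_h, d_d

def haar_idwt2(ll, d_w, d_h, d_d):
    """Exact inverse of haar_dwt2: rebuilds the (C, H, W) array."""
    c, h, w = ll.shape
    x = np.empty((c, 2 * h, 2 * w), dtype=ll.dtype)
    x[:, 0::2, 0::2] = (ll + d_w + d_h + d_d) / 2.0
    x[:, 0::2, 1::2] = (ll - d_w + d_h - d_d) / 2.0
    x[:, 1::2, 0::2] = (ll + d_w - d_h - d_d) / 2.0
    x[:, 1::2, 1::2] = (ll - d_w - d_h + d_d) / 2.0
    return x

# Hypothetical feature map: 3 channels, 8x8 pixels.
features = np.random.default_rng(0).standard_normal((3, 8, 8))
bands = haar_dwt2(features)
restored = haar_idwt2(*bands)
```

Each sub-band has a quarter of the spatial positions of the input, so any layer applied to it processes four times fewer positions, while the inverse transform recovers the original values exactly.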

Abstract

As a fundamental task in image processing, blind image restoration (BIR) faces significant challenges due to the unknown nature of the degradation process. While transformer-based methods have shown promise in various applications, they encounter difficulties in BIR. One key challenge is that the complexity of degradation easily leads these methods to incorporate irrelevant information into their attention mechanisms, thereby hindering restoration performance. To address this challenge, sparsification strategies have been commonly adopted. However, existing sparse transformer-based methods typically determine sparse members through fixed patterns such as constant thresholds or predefined sources, making their sparsification strategies too rigid. To tackle this issue, we propose WSformer, a Wavelet-based Sparse transformer tailored for BIR, which offers three key advantages. First, we design a Sparse Reciprocal Multi-head Self-Attention (SR-MSA) mechanism in the attention layer. This mechanism employs sparse and reciprocal strategies to adaptively select reliable information, while operating across channels to reduce computational complexity. Second, recognizing that feed-forward networks in existing transformer blocks fail to effectively leverage global information, we develop a Recalibrated Feed-Forward Network (RFFN). It fully exploits the fusion of local and global information, enhancing the robustness of feature learning. Finally, to mitigate the increased computational burden introduced by these innovations, we equip WSformer with a wavelet transform. Combined with a U-shaped architecture, this enables WSformer to achieve an optimal balance between performance and inference time. Extensive experiments on multiple BIR tasks validate WSformer's effectiveness in both quantitative metrics and visual quality. The code is available at https://github.com/CanZhang01/WSformer.
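
The abstract's point that attention "operating across channels" reduces complexity can be made concrete with a small NumPy sketch. This is not SR-MSA: it is generic channel-wise attention with a fixed top-k mask, which is exactly the kind of rigid rule the abstract says WSformer moves away from. It is shown only to illustrate the size of the attention map and what sparsifying it means. The function name channel_attention_topk and the parameter keep are hypothetical.

```python
import numpy as np

def channel_attention_topk(q, k, v, keep):
    """Channel-wise attention with a fixed top-k mask (illustrative only).

    q, k, v: arrays shaped (C, N), where N = H * W flattened pixels.
    keep: number of attention scores retained in each row.
    Returns the attended features (C, N) and the attention weights (C, C).
    """
    # Normalize each channel so scores are cosine similarities in [-1, 1].
    qn = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    kn = k / (np.linalg.norm(k, axis=1, keepdims=True) + 1e-8)
    scores = qn @ kn.T  # shape (C, C): its size does not depend on N

    # Keep the `keep` largest scores per row and mask out the rest.
    kth_largest = np.sort(scores, axis=1)[:, -keep][:, None]
    masked = np.where(scores >= kth_largest, scores, -np.inf)

    # Row-wise softmax; masked entries receive zero weight.
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Hypothetical sizes: 16 channels, a 32x32 feature map flattened to 1024 pixels.
rng = np.random.default_rng(0)
C, N = 16, 32 * 32
q, k, v = (rng.standard_normal((C, N)) for _ in range(3))
out, weights = channel_attention_topk(q, k, v, keep=4)
```

With these sizes the attention map has 16 x 16 entries, whereas attention across pixels would need 1024 x 1024. An adaptive scheme would replace the fixed keep value with a selection rule driven by the features themselves.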
