What question did this study set out to answer?

The research aims to enhance image super-resolution by developing a new adaptive sparse self-attention method.

March 8, 2026

Adaptive Sparse Self-Attention for Efficient Image Super-resolution and beyond

Key Points

Developed local spatial variant feature estimation for self-attention queries and keys.
Introduced adaptive selection of similarity values in the self-attention matrix.
Analyzed both local and non-local features for better structural restoration.
The proposed method shows improved accuracy in image super-resolution.
Demonstrated lower model complexity compared to state-of-the-art self-attention methods.

Abstract

Benefiting from the effectiveness of self-attention mechanisms in the Transformer framework for modeling non-local image features, image super-resolution has made significant progress. We note that existing self-attention mechanisms usually use all similarities between the query and key tokens for feature aggregation. However, using all the similarities does not effectively facilitate high-quality image reconstruction, as not all query tokens are relevant to those in the keys. We further note that self-attention mechanisms are less effective at exploring local features, which limits the restoration of structural details. To overcome these problems, we develop a simple yet effective adaptive sparse self-attention method that utilizes the most useful token information for image restoration. We first develop a local spatial variant feature estimation method to build the queries and keys used in self-attention so that local information can be better modeled. Then, we present a simple yet effective sparse self-attention that adaptively selects the most useful similarity values from the self-attention matrix for better feature aggregation. Our analysis shows that the proposed method models both local and non-local features and thus facilitates better structural detail restoration. We further show that the proposed method can serve as an alternative to existing self-attention mechanisms for better image restoration. Experimental results show that the proposed method performs favorably against state-of-the-art methods on benchmark datasets in terms of accuracy and model complexity.
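The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the similarity values are chosen, so the top-k rule below, the function names (`dense_attention`, `topk_sparse_attention`), and the `keep` parameter are assumptions made for illustration and are not the authors' method. The local spatial variant feature estimation used to build the queries and keys is left out; they are passed in as plain lists of token vectors.

```python
import math

def softmax(row):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def similarity(queries, keys):
    """Scaled dot-product similarity matrix, one row per query token."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [[scale * sum(a * b for a, b in zip(q, k)) for k in keys]
            for q in queries]

def aggregate(weights, values):
    """Weighted sum of the value vectors for each query token."""
    dim = len(values[0])
    return [[sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
            for row in weights]

def dense_attention(queries, keys, values):
    """Baseline: every similarity contributes to the aggregation."""
    weights = [softmax(row) for row in similarity(queries, keys)]
    return aggregate(weights, values), weights

def topk_sparse_attention(queries, keys, values, keep):
    """Assumed sparsifier: keep only the `keep` largest similarities per query."""
    weights = []
    for row in similarity(queries, keys):
        threshold = sorted(row, reverse=True)[keep - 1]
        # Mask everything below the threshold so softmax assigns it zero weight.
        masked = [s if s >= threshold else -math.inf for s in row]
        weights.append(softmax(masked))
    return aggregate(weights, values), weights

# Toy inputs: 2 query tokens, 4 key/value tokens, feature dimension 2.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]

sparse_out, sparse_weights = topk_sparse_attention(queries, keys, values, keep=2)
dense_out, dense_weights = dense_attention(queries, keys, values)
```

In this sketch each query aggregates only its two most similar keys, while the dense baseline spreads weight over all four. A fixed `keep` is the simplest choice; an adaptive rule, as the abstract describes, would pick the retained similarities per query from the data instead.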

Cite This Study

Pan et al. (Thu,) studied this question.

synapsesocial.com/papers/69ada8b2bc08abd80d5bbea1 https://doi.org/10.1109/tpami.2026.3670856