What type of study is this?

This is an experimental study.

September 16, 2025 | Open Access

EDTST: Efficient Dynamic Token Selection Transformer for Hyperspectral Image Classification

Key Points

EDTST achieves the highest classification accuracy among the compared models, including a 3% improvement in overall accuracy on the WHU-Hi-HanChuan dataset.
The model's dynamic token pruning reduces computational load, focusing on significant spectral-spatial features.
Extensive experiments on four standard hyperspectral datasets confirm EDTST's superior accuracy, along with the shortest training and inference time among the compared state-of-the-art models from the past three years.
This architecture addresses the limitations of convolutional networks in capturing long-range dependencies.

Abstract

Hyperspectral images, characterized by rich spectral information, enable precise pixel-level classification and are thus widely employed in remote sensing applications. Although convolutional neural networks (CNNs) have demonstrated effectiveness in hyperspectral image processing, their limited receptive fields constrain their capacity to capture long-range dependencies. Transformers excel at modeling long-range features for hyperspectral image classification (HSIC). Yet, they often overlook effective representation of local spectral–spatial characteristics while incurring computational redundancy from numerous classification-irrelevant tokens. To address these challenges, we propose EDTST, a state-of-the-art Vision Transformer architecture specifically designed for efficient hyperspectral image classification. The model utilizes a large-kernel 3D convolution block to extract deep spectral–spatial features. A 2D convolution block further refines these features, followed by a novel attention mechanism with dynamic token pruning that substantially reduces the computational load by focusing on the most pertinent features. The process concludes with an adaptive average pooling layer and a fully connected layer for classification. Extensive experiments on four standard hyperspectral datasets demonstrate that EDTST achieves the highest classification accuracy, with a notable 3% improvement in overall accuracy on the WHU-Hi-HanChuan dataset, while requiring the shortest training and inference time among all compared state-of-the-art models from the past three years. These results validate the efficacy of our approach in achieving superior performance with markedly improved computational efficiency.
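The abstract does not say how tokens are scored or how many are kept, so the following is only a minimal sketch of the general idea behind attention-based token pruning, not the authors' implementation. It assumes each token's importance is the mean attention it receives from all tokens and that a fixed fraction of tokens is retained. The names `attention_weights`, `prune_tokens`, `attend`, and `keep_ratio` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(tokens):
    """Scaled dot-product self-attention weights, with queries = keys = tokens."""
    scale = 1.0 / math.sqrt(len(tokens[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in tokens])
        for q in tokens
    ]


def prune_tokens(tokens, keep_ratio=0.5):
    """Keep the fraction of tokens that receive the most attention on average.

    Assumption: importance of token j is the column mean of the attention
    matrix. The paper's actual selection rule may differ.
    """
    n = len(tokens)
    weights = attention_weights(tokens)
    importance = [sum(weights[i][j] for i in range(n)) / n for j in range(n)]
    k = max(1, math.ceil(keep_ratio * n))
    ranked = sorted(range(n), key=lambda j: importance[j], reverse=True)
    kept_idx = sorted(ranked[:k])  # restore the original token order
    return [tokens[j] for j in kept_idx], kept_idx


def attend(tokens):
    """Self-attention output: each row is an attention-weighted sum of tokens."""
    dim = len(tokens[0])
    return [
        [sum(w * t[c] for w, t in zip(row, tokens)) for c in range(dim)]
        for row in attention_weights(tokens)
    ]


# Four toy 2-dimensional tokens; the third is nearly zero and attracts little attention.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 0.05], [1.0, 0.1]]
kept, idx = prune_tokens(tokens, keep_ratio=0.5)
out = attend(kept)  # later attention runs on half as many tokens
```

Because self-attention cost grows quadratically with the number of tokens, halving the token count in this way cuts the pairwise score computations in later layers to roughly a quarter.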
