Convolutional neural networks (CNNs) are widely used for hyperspectral image (HSI) classification because they extract local spatial features well. However, CNNs struggle to model dependencies across long data sequences, which limits their ability to process the spectral sequence features of HSI. To overcome this limitation, and inspired by the Transformer model, a spatial–spectral transformer with cross-attention (CASST) method is proposed. The method has a dual-branch structure consisting of a spatial branch and a spectral sequence branch. The former captures fine-grained spatial information of the HSI, and the latter extracts spectral features and establishes interdependencies between spectral sequences. Specifically, to enhance consistency among features and relieve the computational burden, we design a spatial–spectral cross-attention module with weighted sharing to extract interactive spatial–spectral fusion features within each Transformer block, and we develop a spatial–spectral weighted sharing mechanism to capture robust semantic features across Transformer blocks. Performance evaluation experiments on three hyperspectral classification datasets demonstrate that CASST achieves better accuracy than state-of-the-art Transformer classification models and mainstream classification networks.
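The cross-attention idea described above can be illustrated with a small NumPy sketch. This is not the authors' CASST implementation, and the abstract does not give the layer details. The token counts, the embedding size, the function names, and the reading of "weighted sharing" as reusing one set of projection matrices for both branches are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Scaled dot-product attention where `queries` tokens attend to `context` tokens.

    Returns the attended features (one row per query token) and the attention weights.
    """
    q = queries @ w_q                          # (n_q, d)
    k = context @ w_k                          # (n_c, d)
    v = context @ w_v                          # (n_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (n_q, n_c)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
d = 16  # hypothetical embedding size

# Hypothetical token sets: 49 spatial tokens (e.g. a flattened 7x7 patch)
# and 30 spectral tokens (e.g. groups of bands).
spatial_tokens = rng.normal(size=(49, d))
spectral_tokens = rng.normal(size=(30, d))

# Assumption: "weighted sharing" is modelled here as one set of projection
# matrices reused by both branches, which halves the attention parameters.
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

# Each branch queries the other branch, then adds the result residually.
spatial_out, attn_spatial = cross_attention(spatial_tokens, spectral_tokens, w_q, w_k, w_v)
spectral_out, attn_spectral = cross_attention(spectral_tokens, spatial_tokens, w_q, w_k, w_v)

spatial_fused = spatial_tokens + spatial_out
spectral_fused = spectral_tokens + spectral_out
```

In this sketch each branch keeps its own token count, so the fused outputs can be passed to a following block in the same two-branch layout. A real model would add multi-head attention, layer normalization, and learned weights.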
Peng et al. (Sat,) studied this question.