What type of study is this?

This is a quantitative study.

September 17, 2025 | Open Access

CCFormer: Cross-Modal Cross-Attention Transformer for Classification of Hyperspectral and LiDAR Data

Key Points

The proposed method improves classification accuracy by effectively integrating hyperspectral and LiDAR data.
Experiments on three public datasets show notable advantages over existing approaches, particularly in reducing feature redundancy and improving feature alignment.
Utilizing a two-level pyramid architecture facilitates multi-scale feature extraction from complex datasets.
A novel cross-attention mechanism captures the semantic correlation between spectral and elevation data during classification.

Abstract

The fusion of multi-source remote sensing data has emerged as a critical technical approach to enhancing the accuracy of ground object classification. The synergistic integration of hyperspectral images and light detection and ranging data can significantly improve the capability of identifying ground objects in complex environments. However, modeling the correlation between their heterogeneous features remains a key technical challenge. Conventional methods often result in feature redundancy due to simple concatenation, making it difficult to effectively exploit the complementary information across modalities. To address this issue, this paper proposes a cross-modal cross-attention Transformer network for the classification of hyperspectral images combined with light detection and ranging data. The proposed method aims to effectively integrate the complementary characteristics of hyperspectral images and light detection and ranging data. Specifically, it employs a two-level pyramid architecture to extract multi-scale features at the shallow level, thereby overcoming the redundancy limitations associated with traditional stacking-based fusion approaches. Furthermore, an innovative cross-attention mechanism is introduced within the Transformer encoder to dynamically capture the semantic correlations between the spectral features of hyperspectral images and the elevation information from light detection and ranging data. This enables effective feature alignment and enhancement through the adaptive allocation of attention weights. Extensive experiments conducted on three publicly available datasets demonstrate that the proposed method exhibits notable advantages over existing state-of-the-art approaches.
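
The abstract does not spell out the exact formulation, so the following is only a minimal sketch of generic scaled dot-product cross-attention, not the CCFormer implementation. It assumes that queries come from hyperspectral tokens while keys and values come from LiDAR tokens; that direction, the function names, the toy values, and the identity projection matrices are all hypothetical choices made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Generic cross-attention: queries from one modality, keys/values from the other.

    Assumption: this is the textbook formulation, softmax(Q K^T / sqrt(d)) V,
    not necessarily the variant used in the paper.
    """
    q = matmul(query_tokens, w_q)
    k = matmul(context_tokens, w_k)
    v = matmul(context_tokens, w_v)
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = [[s * scale for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]  # one weight row per query token
    return matmul(weights, v), weights


# Toy data (hypothetical): 2 hyperspectral tokens and 3 LiDAR tokens, each of dimension 3.
hsi_tokens = [[1.0, 0.0, 0.5], [0.2, 0.9, 0.1]]
lidar_tokens = [[0.3, 0.7, 0.0], [0.8, 0.1, 0.4], [0.0, 0.5, 0.5]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # stand-in for learned projections

fused, weights = cross_attention(hsi_tokens, lidar_tokens, identity, identity, identity)
```

Each row of `weights` sums to 1 and indicates how strongly one hyperspectral token attends to each LiDAR token; `fused` holds the resulting LiDAR-informed representation for each hyperspectral token. In a trained network the projection matrices would be learned rather than fixed to the identity.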
