August 15, 2025 | Open Access

Cross Attention Based Dual-Modality Collaboration for Hyperspectral Image and LiDAR Data Classification

Key Points

CAB-HL, a novel dual-path model, improves classification accuracy by fusing HSI and LiDAR data.
CAB-HL outperformed existing approaches on three benchmark datasets, indicating stronger joint representation learning.
The study employs a multi-stage cross-attention mechanism to refine and align features, improving spatial clarity and reducing redundancy.
This work highlights the importance of advanced feature fusion techniques in remote sensing.

Abstract

Advancements in satellite sensor technology have enabled access to diverse remote sensing (RS) data from multiple platforms. Hyperspectral Image (HSI) data offers rich spectral detail for material identification, while LiDAR captures high-resolution 3D structural information, making the two modalities naturally complementary. Fusing HSI and LiDAR mitigates the limitations of each and improves tasks such as land cover classification, vegetation analysis, and terrain mapping through more robust spectral–spatial feature representation. However, traditional multi-scale feature fusion models often struggle to align features effectively, which can lead to redundant outputs and diminished spatial clarity. To address these issues, we propose the Cross Attention Bridge for HSI and LiDAR (CAB-HL), a novel dual-path framework that employs a multi-stage cross-attention mechanism to guide the interaction between spectral and spatial features. In CAB-HL, features from each modality are refined across three progressive stages using cross-attention modules, which enhance contextual alignment while preserving the distinctive characteristics of each modality. The refined representations are then integrated and passed through a lightweight classification head. Extensive experiments on three benchmark RS datasets demonstrate that CAB-HL consistently outperforms existing state-of-the-art models, confirming its effectiveness in learning deep joint representations for multimodal classification tasks.
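The abstract does not give implementation details, so the following is only a minimal, dependency-free sketch of the general idea of bidirectional cross-attention applied over three stages. Everything beyond "three stages of cross-attention followed by integration" is an assumption made for illustration: the function names (`cross_attention`, `fuse`, `mean_pool`), the absence of learned projections, the residual connection, the shared feature dimension of both modalities, and the mean-pool-then-concatenate fusion are hypothetical and may differ from what CAB-HL actually does.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product cross-attention with a residual connection.

    Assumption: no learned projections, so Q = queries and K = V = context.
    Both token sets must share the same feature dimension.
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    refined = []
    for q in queries:
        # Similarity of this query token to every context token.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in context]
        weights = softmax(scores)
        # Weighted average of the context tokens.
        attended = [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        # Residual add keeps the query modality's own features (an assumption).
        refined.append([qj + aj for qj, aj in zip(q, attended)])
    return refined


def mean_pool(tokens):
    """Average a list of token vectors into a single vector."""
    count = len(tokens)
    return [sum(t[j] for t in tokens) / count for j in range(len(tokens[0]))]


def fuse(hsi_tokens, lidar_tokens, stages=3):
    """Refine both paths for `stages` rounds, then concatenate pooled features."""
    for _ in range(stages):
        # Both updates use the previous stage's tokens (simultaneous update).
        hsi_tokens, lidar_tokens = (
            cross_attention(hsi_tokens, lidar_tokens),
            cross_attention(lidar_tokens, hsi_tokens),
        )
    # Hypothetical integration step: pool each path and concatenate.
    return mean_pool(hsi_tokens) + mean_pool(lidar_tokens)


# Toy inputs: 2 HSI tokens and 3 LiDAR tokens, each with 3 features.
hsi = [[0.2, 0.1, 0.4], [0.3, 0.0, 0.5]]
lidar = [[0.1, 0.9, 0.0], [0.0, 0.8, 0.1], [0.2, 0.7, 0.0]]
fused = fuse(hsi, lidar)
print(len(fused))  # 6
```

In a real model the fused vector would feed a classification head, and the queries, keys, and values would come from learned linear projections rather than the raw tokens used here.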

Cite This Study

Hussain et al. (2025) studied this question.

synapsesocial.com/papers/68af50a1ad7bf08b1ead8ba4 https://doi.org/10.3390/rs17162836