June 12, 2026 | Open Access

A Multi-Modal Remote Sensing Image Classification Method Based on Physics-Guided Feature Decoupling and Adaptive Collaborative Fusion of HSI–LiDAR

Key Points

Aimed to enhance land cover classification accuracy by addressing feature entanglement and leveraging physical priors in remote sensing data.
Developed the Physics-Guided Adaptive Decoupling and Collaborative Enhancement Network (ADCE-Net).
Used the Digital Surface Model (DSM) as geometric guidance for feature decoupling.
Conducted experiments on three benchmark datasets: Trento, Houston 2013, and Muufl Gulfport.
Achieved overall accuracies of 99.69% for Trento, 97.37% for Houston 2013, and 94.90% for Muufl Gulfport.
Surpassed representative baselines, including support vector machines, 3D convolutional neural networks, transformer-based models, and recurrent neural networks.
Improved classification of minority classes and of classes with highly similar spectral signatures.
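
The key points report overall accuracy alongside gains on minority classes. Overall accuracy is the fraction of correctly classified samples, so large classes dominate it, which is why per-class figures are usually read next to it. The sketch below shows both on a hypothetical three-class confusion matrix; the numbers and function names are illustrative assumptions and are not taken from the study.

```python
def overall_accuracy(confusion):
    """Overall accuracy: correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_accuracy(confusion):
    """Per-class accuracy (recall): diagonal entry divided by its row sum."""
    return [
        row[i] / sum(row) if sum(row) else float("nan")
        for i, row in enumerate(confusion)
    ]


# Hypothetical confusion matrix: rows are true classes, columns are predictions.
toy_confusion = [
    [90, 5, 5],
    [2, 8, 0],   # a minority class with only 10 samples
    [3, 0, 87],
]

oa = overall_accuracy(toy_confusion)        # 185 correct out of 200 samples
pa = per_class_accuracy(toy_confusion)      # one value per class
aa = sum(pa) / len(pa)                      # average accuracy over classes
print(f"OA={oa:.4f}  per-class={[round(p, 4) for p in pa]}  AA={aa:.4f}")
```

In this toy case the minority class reaches only 0.8 while the overall figure stays above 0.92, which illustrates why a high overall accuracy alone does not show how well small classes are handled.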

Abstract

Hyperspectral images (HSIs) and Light Detection and Ranging (LiDAR) data offer complementary spectral and spatial information and are extensively applied to land cover classification. Nevertheless, current fusion–classification approaches frequently suffer from cross-modal feature entanglement and insufficient exploitation of LiDAR physical priors, particularly the Digital Surface Model (DSM), which limits the interpretability of learned features and restricts classification accuracy. To address these issues, this study presents a Physics-Guided Adaptive Decoupling and Collaborative Enhancement Network (ADCE-Net) that embeds explicit geometric guidance into multimodal feature learning. In ADCE-Net, the DSM serves as an explicit geometric conditioning signal to guide feature decoupling, decomposing input representations into modality-shared semantic features (SSF) and modality-specific discriminative features (MSF), thereby mitigating cross-modal interference at an early stage. Based on this decomposition, an adaptive collaborative enhancement mechanism is designed using bidirectional cross-attention and dynamic gating to achieve context-aware mutual refinement between SSF and MSF, facilitating more effective utilization of cross-modal complementary information. Furthermore, a multi-level collaborative classification architecture is constructed to integrate multi-scale contextual representations, enhancing spatial consistency and boundary delineation. Extensive experiments on three benchmark datasets (Trento, Houston 2013, and Muufl Gulfport) demonstrate that ADCE-Net achieves overall accuracies of 99.69%, 97.37%, and 94.90%, respectively, surpassing multiple representative methods including support vector machines, 3D convolutional neural networks, transformer-based models, and recurrent neural networks. Noticeable improvements are also achieved for minority classes and classes with highly similar spectral signatures. The DSM-driven physics guidance boosts both classification performance and feature interpretability, providing a reliable and explainable paradigm for multimodal remote sensing classification.
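
The abstract names bidirectional cross-attention with dynamic gating as the mechanism for mutual refinement between SSF and MSF but gives no equations. The following is a minimal NumPy sketch of that general idea under stated assumptions, not the ADCE-Net implementation: single-head attention without learned query/key/value projections, a sigmoid gate computed from the concatenated original and attended features, random stand-in inputs, and no DSM conditioning. All function names, shapes, and weights are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cross_attention(query, context):
    """Scaled dot-product attention: each query token attends over context tokens."""
    d = query.shape[-1]
    scores = query @ context.T / np.sqrt(d)      # (n_query, n_context)
    return softmax(scores, axis=-1) @ context    # (n_query, d)


def gated_bidirectional_refine(ssf, msf, w_gate_s, w_gate_m):
    """Refine each stream with information attended from the other stream.

    The gate (an assumption of this sketch) blends the original feature with
    the attended one, element by element, as a convex combination.
    """
    s_from_m = cross_attention(ssf, msf)   # shared features query specific ones
    m_from_s = cross_attention(msf, ssf)   # specific features query shared ones
    g_s = sigmoid(np.concatenate([ssf, s_from_m], axis=-1) @ w_gate_s)
    g_m = sigmoid(np.concatenate([msf, m_from_s], axis=-1) @ w_gate_m)
    ssf_out = g_s * ssf + (1.0 - g_s) * s_from_m
    msf_out = g_m * msf + (1.0 - g_m) * m_from_s
    return ssf_out, msf_out


# Random stand-ins for the two feature streams (hypothetical sizes).
rng = np.random.default_rng(0)
n_tokens, dim = 6, 4
ssf = rng.standard_normal((n_tokens, dim))
msf = rng.standard_normal((n_tokens, dim))
w_gate_s = 0.1 * rng.standard_normal((2 * dim, dim))
w_gate_m = 0.1 * rng.standard_normal((2 * dim, dim))

ssf_out, msf_out = gated_bidirectional_refine(ssf, msf, w_gate_s, w_gate_m)
print(ssf_out.shape, msf_out.shape)
```

Because the gate lies strictly between 0 and 1, each refined feature stays between its original value and the value attended from the other stream, so the gate controls how much cross-stream information is admitted per element.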
