Hyperspectral images (HSIs) and Light Detection and Ranging (LiDAR) data offer complementary spectral and spatial information and are widely used for land cover classification. Nevertheless, current fusion-classification approaches frequently suffer from cross-modal feature entanglement and make insufficient use of LiDAR physical priors, particularly the Digital Surface Model (DSM), which limits both the interpretability of the learned features and the classification accuracy. To address these issues, this study presents a Physics-Guided Adaptive Decoupling and Collaborative Enhancement Network (ADCE-Net) that embeds explicit geometric guidance into multimodal feature learning. In ADCE-Net, the DSM serves as an explicit geometric conditioning signal that guides feature decoupling: input representations are decomposed into modality-shared semantic features (SSF) and modality-specific discriminative features (MSF), which mitigates cross-modal interference at an early stage. Building on this decomposition, an adaptive collaborative enhancement mechanism uses bidirectional cross-attention and dynamic gating to achieve context-aware mutual refinement between SSF and MSF, making more effective use of cross-modal complementary information. Furthermore, a multi-level collaborative classification architecture integrates multi-scale contextual representations, enhancing spatial consistency and boundary delineation. Extensive experiments on three benchmark datasets (Trento, Houston 2013, and MUUFL Gulfport) show that ADCE-Net achieves overall accuracies of 99.69%, 97.37%, and 94.90%, respectively, surpassing multiple representative methods, including support vector machines, 3D convolutional neural networks, transformer-based models, and recurrent neural networks. Noticeable improvements are also achieved for minority classes and for classes with highly similar spectral signatures. The DSM-driven physics guidance improves both classification performance and feature interpretability, providing a reliable and explainable paradigm for multimodal remote sensing classification.
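To make the collaborative enhancement step more concrete, the following is a minimal NumPy sketch of bidirectional cross-attention combined with a sigmoid gate. It is an illustration under stated assumptions, not the ADCE-Net implementation: the function names, the single-head attention without learned query/key/value projections, and the gate weight matrices `w_gate_s` and `w_gate_m` are all hypothetical, and the DSM-guided decoupling that would produce the SSF and MSF tokens is not modeled here.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cross_attention(query, context):
    """Single-head scaled dot-product attention: query (n, d) attends to context (m, d).

    Assumption: identity projections are used instead of learned Q/K/V weights.
    """
    d = query.shape[-1]
    weights = softmax(query @ context.T / np.sqrt(d), axis=-1)  # (n, m)
    return weights @ context  # (n, d)


def gated_bidirectional_enhance(ssf, msf, w_gate_s, w_gate_m):
    """Refine SSF with MSF context and MSF with SSF context (hypothetical sketch).

    ssf: (n, d) modality-shared tokens; msf: (m, d) modality-specific tokens.
    w_gate_s, w_gate_m: (2d, d) gate weights, assumed to be learned elsewhere.
    """
    s_from_m = cross_attention(ssf, msf)  # SSF queries attend to MSF
    m_from_s = cross_attention(msf, ssf)  # MSF queries attend to SSF

    # Dynamic gate: per-feature mixing weight computed from the original
    # token and its cross-attended counterpart.
    g_s = sigmoid(np.concatenate([ssf, s_from_m], axis=-1) @ w_gate_s)
    g_m = sigmoid(np.concatenate([msf, m_from_s], axis=-1) @ w_gate_m)

    ssf_out = g_s * ssf + (1.0 - g_s) * s_from_m
    msf_out = g_m * msf + (1.0 - g_m) * m_from_s
    return ssf_out, msf_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ssf = rng.normal(size=(4, 8))
    msf = rng.normal(size=(6, 8))
    w_s = rng.normal(scale=0.1, size=(16, 8))
    w_m = rng.normal(scale=0.1, size=(16, 8))
    s_out, m_out = gated_bidirectional_enhance(ssf, msf, w_s, w_m)
    print(s_out.shape, m_out.shape)  # (4, 8) (6, 8)
```

In this sketch the gate decides, feature by feature, how much of each token is kept and how much is replaced by information attended from the other branch; a gate near 1 preserves the original feature, while a gate near 0 favors the cross-attended one.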