To meet the demand for high-precision target classification in complex scenes, a hyperspectral–polarimetric–LiDAR multimodal image fusion method tailored to few-shot scenarios is proposed. Feature-mapping functions are constructed for the polarimetric and LiDAR images, and a multi-scale hierarchical optimization strategy jointly enhances the low- and high-frequency components across modalities. The approach addresses key challenges that arise under limited training data, such as the large dimensional disparity between modalities and the difficulty of robust feature extraction and fusion. The algorithm is first evaluated on bimodal image fusion using the NWPUSP spectral-polarization dataset and the KAIST spectral-depth dataset. Compared with other fusion methods, it achieves average increases on the two datasets of 7.3% and 4.87% in information entropy, 53.18% and 30.35% in standard deviation, 48% and 108.28% in average gradient, and 96.25% and 101.13% in spatial frequency, respectively. In addition, using a self-developed integrated hyperspectral-polarization imaging system together with a commercial LiDAR, we synchronously acquire hyperspectral, polarization, and LiDAR images of complex ground-object scenes. Comparative experiments against six other mainstream fusion algorithms show average improvements of 7.19% in information entropy, 46.85% in standard deviation, 76.62% in average gradient, and 79.74% in spatial frequency, indicating that the fused images retain notably more feature information. Under few-shot conditions, the classification accuracy and Kappa coefficient obtained from the fused image are 9.8% and 11.05% higher, respectively, than those obtained from the unimodal hyperspectral image.
The fused image effectively highlights targets under shadow occlusion and compensates for LiDAR’s weak response to surface texture, so that the modalities complement one another for ground-object targets in complex scenes. This research provides a new approach for optical multimodal remote sensing and image fusion.
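For readers unfamiliar with the four objective metrics quoted above, the following is a minimal sketch of how information entropy, standard deviation, average gradient, and spatial frequency are commonly computed for a single-band image. It is not taken from the paper: the function names, the assumption of an 8-bit grayscale input, and the exact normalizations (which vary between publications) are illustrative choices, and `relative_gain` is a hypothetical helper for expressing a metric difference as a percentage.

```python
import numpy as np


def information_entropy(img, levels=256):
    """Shannon entropy (bits) of the gray-level histogram.

    Assumes an unsigned-integer image with values in [0, levels).
    """
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins, since 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))


def standard_deviation(img):
    """Standard deviation of pixel intensities (a contrast measure)."""
    return float(np.std(img.astype(np.float64)))


def average_gradient(img):
    """Mean local gradient magnitude from forward differences.

    Uses the common form mean(sqrt((dx^2 + dy^2) / 2)) over the
    (M-1) x (N-1) interior grid; other normalizations exist.
    """
    f = img.astype(np.float64)
    dx = f[:-1, 1:] - f[:-1, :-1]  # horizontal forward difference
    dy = f[1:, :-1] - f[:-1, :-1]  # vertical forward difference
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))


def spatial_frequency(img):
    """sqrt(RF^2 + CF^2), with row and column frequencies taken as
    mean squared differences between adjacent pixels."""
    f = img.astype(np.float64)
    rf2 = np.mean((f[:, 1:] - f[:, :-1]) ** 2)  # row frequency squared
    cf2 = np.mean((f[1:, :] - f[:-1, :]) ** 2)  # column frequency squared
    return float(np.sqrt(rf2 + cf2))


def relative_gain(ours, baseline):
    """Percentage increase of `ours` over `baseline` (hypothetical helper)."""
    return 100.0 * (ours - baseline) / baseline
```

Under these definitions, higher values of all four metrics indicate more information content, contrast, or edge detail in the fused image. Whether the percentages reported above were computed exactly as in `relative_gain` is an assumption here, not something the text states.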
Yin et al. (Sun,) studied this question.