Sub-pixel matching of multimodal optical images is a critical step in the combined application of multiple sensors. However, structural noise and inconsistencies arising from differences in image response across modalities usually limit matching accuracy. To address this, we develop phase congruency mutual-structure weighted least absolute deviation (PCWLAD), a coarse-to-fine framework. In the coarse matching stage, we preserve the complete structure and use an enhanced cross-modal similarity criterion to mitigate the structural information loss caused by phase congruency (PC) noise filtering. In the fine matching stage, a method based on mutual-structure filtering and weighted least absolute deviation is introduced to enhance inter-modal structural consistency and to adaptively estimate sub-pixel displacements. Experiments on three multimodal datasets, namely Landsat visible-infrared, short-range visible-near-infrared, and unmanned aerial vehicle (UAV) optical image pairs, show that PCWLAD achieves better average performance than eight state-of-the-art methods, attaining an average matching accuracy of approximately 0.4 pixels.
Huang et al. (Sun,) studied this question.
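
To make the idea of a weighted least absolute deviation (WLAD) displacement estimate concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pure translation between two already coarsely aligned patches, linearizes the intensity difference with image gradients, and minimizes the weighted sum of absolute residuals by iteratively reweighted least squares (IRLS). The function name `wlad_subpixel_shift`, the optional `weights` argument (a hypothetical stand-in for structure-based weights), and the IRLS solver are illustrative assumptions; the phase congruency coarse stage and the mutual-structure filtering are omitted.

```python
import numpy as np

def wlad_subpixel_shift(ref, tgt, weights=None, n_iter=20, eps=1e-6):
    """Estimate (dx, dy) such that tgt(x, y) ~= ref(x - dx, y - dy).

    Minimizes sum_i w_i * |gx_i*dx + gy_i*dy - (ref_i - tgt_i)| with IRLS.
    Assumes a small (sub-pixel) translation between the two patches.
    """
    ref = np.asarray(ref, dtype=float)
    tgt = np.asarray(tgt, dtype=float)

    # Gradients of the mean image: a symmetric linearization that cancels
    # the second-order Taylor term. Axis 0 is rows (y), axis 1 is columns (x).
    gy, gx = np.gradient(0.5 * (ref + tgt))

    # Drop the one-pixel border, where np.gradient uses one-sided differences.
    inner = (slice(1, -1), slice(1, -1))
    A = np.column_stack([gx[inner].ravel(), gy[inner].ravel()])
    b = (ref - tgt)[inner].ravel()

    if weights is None:
        w = np.ones_like(b)  # uniform weights when none are supplied
    else:
        w = np.asarray(weights, dtype=float)[inner].ravel()

    d = np.zeros(2)
    irls_w = w.copy()  # first pass is an ordinary weighted least-squares fit
    for _ in range(n_iter):
        Aw = A * irls_w[:, None]
        d = np.linalg.solve(A.T @ Aw, Aw.T @ b)
        r = A @ d - b
        # L1 reweighting: large residuals receive small weights.
        irls_w = w / np.maximum(np.abs(r), eps)
    return d

# Usage on a synthetic smooth pattern shifted by a known sub-pixel amount.
yy, xx = np.mgrid[0:64, 0:64].astype(float)
pattern = lambda x, y: np.sin(0.2 * x) + np.cos(0.15 * y) + np.sin(0.1 * (x + y))
dx_est, dy_est = wlad_subpixel_shift(pattern(xx, yy), pattern(xx - 0.3, yy + 0.2))
```

The absolute-value loss is chosen here because it is less sensitive than a squared loss to pixels whose intensities disagree across modalities. In a multimodal setting the raw intensities would typically be replaced by structure maps before this step, which this sketch does not attempt.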