Concrete cracks imaged in extremely dark environments suffer from severe lack of illumination and large differences between modalities, which make heterogeneous image registration difficult and segmentation accuracy low. To address these issues, this paper proposes a two-stage processing framework: registration and fusion first, then decoupling and segmentation. In the registration and fusion stage, a registration algorithm based on morphological priors and multi-level quadtree spatial constraints is designed. This approach recasts the problem from pixel grayscale matching to spatial topological matching, yielding a fused feature that combines the high saliency of the infrared image with the high sharpness of the visible-light image. In the segmentation stage, a Latent Frequency-Decoupled Topological Network (LFDT-Net) is proposed. It uses the Discrete Wavelet Transform (DWT) to achieve high-fidelity frequency decoupling of the low-frequency infrared backbone and the high-frequency visible-light edges. Furthermore, a Cross-Frequency Guidance Module is used to eliminate double-edge artifacts, and a skeleton-aware topological loss function is introduced to constrain the topological integrity of the cracks. Experimental results on a self-built heterogeneous multi-modal crack dataset demonstrate that the proposed method significantly outperforms existing mainstream methods in registration accuracy, fusion quality, and segmentation accuracy. The method achieves a mean Intersection over Union (mIoU) of 81.7%, effectively suppresses background noise in dark environments, and precisely restores the microscopic edges and continuous topological structures of faint cracks.
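The idea of frequency decoupling with the DWT can be illustrated with a minimal sketch. It is not LFDT-Net: it assumes a single-level Haar wavelet (the wavelet family is not stated here), operates on raw pixel grids rather than latent features, and the function names `haar_dwt2`, `haar_idwt2`, and `fuse_low_high` are hypothetical. It only shows how a low-frequency band taken from a registered infrared image can be recombined with high-frequency detail bands taken from a visible-light image.

```python
def haar_dwt2(img):
    """Single-level 2D Haar DWT of a 2D list with even height and width.

    Returns (approx, row_diff, col_diff, diag): one low-frequency band and
    three high-frequency detail bands, each half the input size.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, row_diff, col_diff, diag = [], [], [], []
    for i in range(0, h, 2):
        ra, rr, rc, rd = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ra.append((a + b + c + d) / 2)  # local average (low frequency)
            rr.append((a + b - c - d) / 2)  # top row minus bottom row
            rc.append((a - b + c - d) / 2)  # left column minus right column
            rd.append((a - b - c + d) / 2)  # diagonal detail
        approx.append(ra)
        row_diff.append(rr)
        col_diff.append(rc)
        diag.append(rd)
    return approx, row_diff, col_diff, diag


def haar_idwt2(approx, row_diff, col_diff, diag):
    """Inverse of haar_dwt2: rebuilds the full-resolution 2D list."""
    out = []
    for ra, rr, rc, rd in zip(approx, row_diff, col_diff, diag):
        top, bottom = [], []
        for ll, lh, hl, hh in zip(ra, rr, rc, rd):
            top += [(ll + lh + hl + hh) / 2, (ll + lh - hl - hh) / 2]
            bottom += [(ll - lh + hl - hh) / 2, (ll - lh - hl + hh) / 2]
        out += [top, bottom]
    return out


def fuse_low_high(infrared, visible):
    """Illustrative fusion (an assumption, not the paper's network):
    low-frequency band from infrared, detail bands from visible light."""
    ir_approx, _, _, _ = haar_dwt2(infrared)
    _, v_row, v_col, v_diag = haar_dwt2(visible)
    return haar_idwt2(ir_approx, v_row, v_col, v_diag)


# Toy 2x2 example: a flat infrared patch and a visible patch with a vertical edge.
infrared = [[4, 4], [4, 4]]
visible = [[1, 3], [1, 3]]
fused = fuse_low_high(infrared, visible)  # [[3.0, 5.0], [3.0, 5.0]]
```

In the toy example the fused patch keeps the infrared mean intensity (4) while inheriting the left-to-right edge of the visible patch, which is the intuition behind pairing a low-frequency infrared backbone with high-frequency visible-light edges.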
Li et al. (Thu,) studied this question.