What does this research mean for the field?

The proposed cross-modal fusion method integrates 3D point cloud data with 2D images to improve the accuracy of bridge defect identification. It achieves a mean Intersection over Union (mIoU) of 0.91 for severe cracks (depth over 2 cm) and 85% accuracy for minor defects (depth under 2 cm). Novelty: novel finding. Consensus alignment: neutral.
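For readers unfamiliar with the metric, mIoU averages the per-class Intersection over Union between predicted and reference labels. The sketch below is a minimal illustration in plain Python. The function name `mean_iou` and the toy labels are assumptions made for illustration and are not the authors' evaluation code.

```python
def mean_iou(pred, truth, num_classes):
    """Average per-class IoU over classes that appear in pred or truth."""
    ious = []
    for c in range(num_classes):
        # Points labelled c in both prediction and reference
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        # Points labelled c in either prediction or reference
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:  # skip classes absent from both label sets
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy per-point labels (hypothetical data): 0 = background, 1 = crack
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, truth, num_classes=2)
```

In this toy case each class has two correctly labelled points out of four in the union, so both per-class IoU values and their mean come out to 0.5.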

What question did this study set out to answer?

This research aims to enhance bridge defect inspection by combining 3D point cloud data with 2D image analysis for improved defect identification.

March 10, 2026 (Open Access)

Cross‐Modal Fusion for 3D Point Cloud and 2D Images in Bridge Defect Inspection

Key Points

Developed a scale-adaptive two-stage cross-modal fusion method.
First stage involves PointNet++ for 3D semantic segmentation and depth quantification.
Second stage uses a 3D–2D–3D fusion method to detect minor defects.
Performed extensive evaluations on a cracked concrete beam.
Achieved a mean Intersection over Union (mIoU) of 0.91 for crack detection with depth over 2 cm.
YOLO11 attained 85% accuracy on 2D depth maps for defects under 2 cm.
Extracted crack depths fit a gamma distribution well, which supports the reliability of the depth estimates.
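The gamma-distribution fit in the last key point can be illustrated with a method-of-moments estimate. This is a minimal sketch on synthetic depths. The authors' fitting procedure and parameter values are not given on this page, so `fit_gamma_moments`, the shape and scale values, and the sample size are all assumptions.

```python
import random
import statistics


def fit_gamma_moments(depths):
    """Method-of-moments gamma fit: shape = mean^2 / var, scale = var / mean."""
    mean = statistics.fmean(depths)
    var = statistics.variance(depths)  # sample variance
    return mean * mean / var, var / mean


# Synthetic crack depths in cm (hypothetical, not the paper's data),
# drawn from a gamma distribution with shape 2.0 and scale 1.5
rng = random.Random(0)
depths = [rng.gammavariate(2.0, 1.5) for _ in range(20000)]
shape, scale = fit_gamma_moments(depths)
```

With enough samples, the recovered `shape` and `scale` should land close to the values used to generate the data. A maximum-likelihood fit would be a common alternative to the moment estimate shown here.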

Abstract

As the service life of bridges increases, timely and accurate identification of cracks is essential for ensuring structural safety and durability. Traditional inspection methods often rely on 2D images, which lack reliable depth information for assessing the severity and progression of defects. 3D point cloud technology complements traditional 2D vision‐based bridge defect detection by providing depth information for spatial analysis and for assessing defect severity and potential extension. Hence, to address the challenges of bridge defect inspection, we propose a scale‐adaptive two‐stage cross‐modal fusion method that integrates 3D point cloud data with 2D images for accurate spatial defect identification. This approach explicitly represents and integrates multisource knowledge, providing a scientifically grounded and reliable solution to bridge defect detection. It supports knowledge‐intensive engineering tasks by combining the advantages of 3D geometric information and 2D semantic cues, enabling better depth quantification and crack assessment. In the first stage, PointNet++ and a registration algorithm are developed for 3D semantic segmentation and depth quantification of severe cracking defects. In the second stage, a physically consistent 3D–2D–3D cross‐modal fusion method is proposed to detect minor defects missed in the previous step, converting the point cloud into depth maps and performing semantic segmentation for depth quantification of smaller defects. A cracked concrete beam is used for method evaluation. Results show that the proposed method is robust across different defect scales, with PointNet++ attaining an mIoU of 0.91 on the crack point cloud for defects deeper than 2 cm and YOLO11 reaching 85% accuracy on 2D depth maps for defects under 2 cm. Extracted crack depths showed a strong fit to the gamma distribution.
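The point-cloud-to-depth-map conversion in the second stage can be pictured as an orthographic projection onto a grid. The sketch below is one simple way to do this, assuming the beam surface is roughly aligned with the x-y plane and depth is measured along z. The function `to_depth_map`, the cell size, and the sample points are hypothetical and do not reproduce the paper's projection.

```python
def to_depth_map(points, cell, width, height):
    """Rasterize (x, y, z) points into a height x width grid, keeping the max depth per cell."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        col, row = int(x // cell), int(y // cell)
        # Ignore points that fall outside the grid
        if 0 <= row < height and 0 <= col < width:
            grid[row][col] = max(grid[row][col], z)
    return grid


# Hypothetical points: x, y in cm on the surface, z = crack depth in cm
points = [(0.2, 0.1, 0.0), (1.4, 0.3, 1.2), (1.6, 0.4, 0.8), (2.5, 1.7, 0.5)]
depth_map = to_depth_map(points, cell=1.0, width=3, height=2)
```

Because each pixel corresponds to a known grid cell, a defect mask predicted on the 2D map could in principle be traced back to the 3D points that fell in those cells, which is the general idea behind a 3D–2D–3D round trip.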

Cite This Study

Xiong et al. studied this question.

synapsesocial.com/papers/69af95de70916d39fea4df4c https://doi.org/10.1155/stc/8695692
