Accurate, rapid, and consistent evaluation of pavement condition across large-scale road networks is critical for sustainable maintenance and rehabilitation planning. However, conventional approaches rely largely on manual visual inspections, which are time-consuming, subjective, and difficult to implement at the network level. This study proposes a semi-automated pavement distress evaluation framework that integrates field-based assessment with computer vision techniques. The study was conducted on a 3 km roadway network within the Yıldız Technical University Davutpaşa Campus. Field-based distress observations served as reference data, while street-level images obtained from the Mapillary platform were analyzed using a deep learning-based YOLOv8 model trained on the RDD2022 dataset, which was developed specifically for road distress detection. The analysis focuses on cracks and potholes, which have a dominant influence on PCR and are highly distinguishable in image-based approaches. Correlation analyses between the automated detection results and the field-based data show strong agreement, with coefficients reaching ρ ≈ 0.90 on some routes. These findings indicate that the two distress types effectively represent variations in pavement condition. The results demonstrate that multi-source image data and deep learning-based detection methods can be used reliably for section-level pavement condition assessment. The proposed approach addresses a key gap in the literature by transforming image-level detections into engineering-based decision-support information. Furthermore, by leveraging publicly available data sources, the framework offers a low-cost and scalable solution that enables rapid preliminary assessment of large road networks, and thus has significant potential to support sustainable infrastructure management and data-driven maintenance strategies.
Several practical challenges encountered during the detection process are also discussed: sensitivity to contrast enhancement parameters, false positives caused by shadows and surface reflections, heterogeneous image resolution across crowdsourced imagery, and training distribution gaps for locally prevalent infrastructure features. Directions for reducing human intervention through adaptive preprocessing and targeted model refinement are identified.
Sitilbay et al. (Thu,) studied this question.
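
The step from image-level detections to a section-level comparison with field data can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: ρ is taken to be Spearman's rank correlation (the abstract does not name the coefficient), sections are fixed 100 m intervals, and the labels `crack` and `pothole`, the chainage values, and the field scores are all hypothetical placeholders rather than study data.

```python
from collections import defaultdict
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (sx * sy)


def detections_per_section(detections, section_length_m=100.0):
    """Count crack/pothole detections per fixed-length section.

    `detections` is a list of (chainage_m, label) pairs; other labels are ignored.
    """
    counts = defaultdict(int)
    for chainage_m, label in detections:
        if label in {"crack", "pothole"}:
            counts[int(chainage_m // section_length_m)] += 1
    return counts


# Hypothetical detections along one route (distance from route start, label).
detections = [
    (12, "crack"), (40, "pothole"), (130, "crack"),
    (250, "crack"), (260, "crack"), (270, "pothole"), (310, "manhole"),
]
counts = detections_per_section(detections)
detected = [counts.get(section, 0) for section in range(4)]

# Hypothetical field-based distress scores for the same four sections
# (higher means more distress).
field_scores = [2.0, 1.0, 3.5, 0.0]

rho = spearman_rho(detected, field_scores)
print(f"sections={detected}, rho={rho:.2f}")
```

A rank-based coefficient is used here because detection counts and field ratings sit on different scales; if the study used a different coefficient, only `spearman_rho` would need to change.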