This paper presents a novel methodology for generating synthetic images that adhere accurately to provided semantic segmentation maps, using the Stable Diffusion model with the ControlNet extension. By combining three ControlNet modules (depth map, Canny edge detection, and segmentation map), our approach produces high-quality images that precisely match the segmentation maps of the original input images. Training semantic segmentation models with these synthetic images yields a noticeable improvement in model performance and enhances the overall quality of the Cityscapes training dataset. Our findings also highlight the potential of generative AI to produce synthetic data that significantly improves real-time semantic segmentation. This research underscores the substantial benefits of synthetic image generation for augmenting existing datasets and for advancing the accuracy and robustness of semantic segmentation models.
Bevacqua et al. (Wed,) studied this question.
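The abstract does not say how adherence to the segmentation maps is measured. As a minimal sketch, one plausible check is the mean intersection-over-union (mIoU) between the input segmentation map and a label map predicted for the generated image by some segmentation model. The function name `mean_iou`, the constant `IGNORE_LABEL`, and the toy label grids below are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

# Assumption: unlabeled pixels carry id 255, as in the Cityscapes training ids.
IGNORE_LABEL = 255


def mean_iou(reference, predicted, ignore_label=IGNORE_LABEL):
    """Mean intersection-over-union between two label maps of equal shape.

    Both arguments are 2D grids (rows of integer class ids). Pixels whose
    reference label equals `ignore_label` are skipped.
    """
    intersection = Counter()  # pixels where both maps agree, per class
    ref_count = Counter()     # pixels per class in the reference map
    pred_count = Counter()    # pixels per class in the predicted map

    for ref_row, pred_row in zip(reference, predicted):
        for ref, pred in zip(ref_row, pred_row):
            if ref == ignore_label:
                continue
            ref_count[ref] += 1
            pred_count[pred] += 1
            if ref == pred:
                intersection[ref] += 1

    classes = set(ref_count) | set(pred_count)
    if not classes:
        raise ValueError("no labeled pixels to compare")

    ious = []
    for c in classes:
        # Union = |reference| + |predicted| - |intersection| for class c.
        union = ref_count[c] + pred_count[c] - intersection[c]
        ious.append(intersection[c] / union)
    return sum(ious) / len(ious)


# Toy 3x3 example: one pixel of class 1 is mislabeled as class 0,
# and the bottom-right pixel is ignored.
reference = [[0, 0, 1],
             [0, 1, 1],
             [2, 2, 255]]
predicted = [[0, 0, 1],
             [0, 0, 1],
             [2, 2, 2]]

score = mean_iou(reference, predicted)
print(f"mIoU = {score:.4f}")
```

A score close to 1 would indicate that the generated image preserves the layout of the input map. In practice the label maps would be full-resolution arrays, and a vectorized implementation would be preferable to this pixel-by-pixel loop.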