Scene sketch-to-image synthesis is a challenging task, especially when the sketches contain multiple objects of different classes. Existing methods suffer from interference between objects of different classes when generating images from scene sketches, which makes it difficult to synthesize images with accurate object classes. In this paper, we propose a scene sketch-to-image generation method based on multi-object control, which generates high-quality, class-accurate images from scene sketches and text prompts. We propose a sampling strategy based on segmentation masks and independent denoising, which accurately controls the classes of foreground objects and better harmonizes foreground objects with the background. Our method builds on a pre-trained diffusion model and incurs no additional training overhead. Experiments on the SketchyCOCO and SketchyScene datasets demonstrate our method's capacity to generate realistic, complex images from scene sketches and text prompts.
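The abstract does not spell out the sampling procedure, so the following is only a rough sketch of what mask-guided, per-object denoising could look like in general, not the authors' implementation. It assumes binary segmentation masks, one denoiser per object, and a simple paste-into-background blend at every step. The names `blend_latents` and `sample`, the nested-list "latents", and the callable denoisers are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]  # toy stand-in for a latent tensor (assumption)


def blend_latents(background: Grid, objects: Sequence[Tuple[Grid, Grid]]) -> Grid:
    """Paste each object's latent into the background where its mask is 1.

    `objects` holds (mask, latent) pairs. Where masks overlap, later pairs
    overwrite earlier ones; this ordering rule is an assumption of the sketch.
    """
    out = [row[:] for row in background]  # copy so the input is not mutated
    for mask, latent in objects:
        for i, mask_row in enumerate(mask):
            for j, m in enumerate(mask_row):
                out[i][j] = m * latent[i][j] + (1 - m) * out[i][j]
    return out


def sample(
    init: Grid,
    masks: Sequence[Grid],
    denoisers: Sequence[Callable[[Grid, int], Grid]],
    background_denoiser: Callable[[Grid, int], Grid],
    steps: int,
) -> Grid:
    """Toy loop: denoise each region independently, then blend at every step."""
    x = init
    for t in reversed(range(steps)):
        bg = background_denoiser(x, t)
        # Each object gets its own denoising pass (e.g. its own class prompt).
        per_object = [(m, d(x, t)) for m, d in zip(masks, denoisers)]
        x = blend_latents(bg, per_object)
    return x
```

In a real pipeline the grids would be latent tensors and each denoiser would be one step of a pre-trained diffusion model conditioned on a class-specific prompt; here they are plain Python callables so the control flow is easy to follow.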
Cheng et al. (Mon,) studied this question.