Automatic anime sketch colorization aims to generate a color image from a sketch image. The task is challenging because limited structural and semantic understanding leads to constrained style and semantic color inconsistency. In this paper, we introduce a sketch-to-color diffusion model with semantic prompt learning (SPL), which learns better semantic prompts to stimulate the powerful structural and semantic understanding capabilities of large-scale multi-modal diffusion models, effectively bridging the gap between sketch and color. We introduce two distillation strategies for learning semantic prompts. The first is prediction-level distillation, which optimizes a global knowledge distillation loss and a local activation knowledge distillation loss. The second is feature-level distillation, which optimizes a hierarchy-wise feature distillation loss to transfer knowledge to the output features of different hierarchies in the model. Experimental results show that the proposed distillation strategies produce high-quality semantic prompts, yielding colorized images with better visual quality than current automatic anime sketch colorization methods.
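The abstract does not give the exact form of these losses, so the following is only an illustrative sketch of the two kinds of distillation it names, not the authors' implementation. It assumes a conventional teacher/student setup, a temperature-softened KL divergence for the global prediction-level term, and a weighted per-level mean squared error for the hierarchy-wise feature term. The function names, the temperature, and the level weights are all hypothetical, and the local activation loss is not sketched because its definition cannot be inferred from the abstract.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum for x in scaled]


def global_kd_loss(teacher_logits, student_logits, temperature=2.0):
    """Assumed prediction-level term: T^2 * KL(teacher || student)
    computed on temperature-softened distributions (generic KD form)."""
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student logits must have equal length")
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


def hierarchy_feature_loss(teacher_feats, student_feats, weights=None):
    """Assumed feature-level term: weighted sum over hierarchy levels of
    the mean squared error between flattened teacher and student features."""
    if len(teacher_feats) != len(student_feats):
        raise ValueError("teacher and student must have the same number of levels")
    if weights is None:
        weights = [1.0] * len(teacher_feats)  # hypothetical default: equal weights
    total = 0.0
    for w, t, s in zip(weights, teacher_feats, student_feats):
        if len(t) != len(s):
            raise ValueError("feature vectors at a level must have equal length")
        mse = sum((a - b) ** 2 for a, b in zip(t, s)) / len(t)
        total += w * mse
    return total


if __name__ == "__main__":
    # Toy values only; real features would come from the diffusion model's layers.
    print(global_kd_loss([1.0, 2.0, 3.0], [1.5, 1.5, 3.0]))
    print(hierarchy_feature_loss([[1.0, 2.0], [0.0]], [[1.0, 4.0], [1.0]]))
```

In a training loop, terms like these would typically be combined as a weighted sum and minimized with respect to the learnable prompt parameters; how the paper weights or schedules them is not stated in the abstract.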
Wang et al. (Mon,) studied this question.