Artificial Intelligence (AI) has become a powerful tool for “data-driven artistic expression,” allowing artists to use even complex data as a creative resource. In this approach, information is transformed into visual forms, producing works that merge reality and imagination and blur the boundary between art and science. This paper presents a new state-of-the-art approach that uses a variant of the Stable Diffusion model, supplemented with captions generated by Bootstrapping Language-Image Pre-training (BLIP) and a custom style consistency loss, to generate high-quality, style-specific artworks. Our results show that this enhanced pre-trained Stable Diffusion model significantly outperforms baseline approaches such as BigGAN and VQ-VAE-2 in image quality, semantic alignment, and style integrity across five creative categories: painting, iconography, engraving, drawing, and sculpture. The model produces images whose artistic styles closely match the target styles and that consistently correspond to the input text description. A key contribution of this research is a style consistency metric that measures how well a generated image preserves the artistic style specific to its category, which supports realistic generation of sculptures, classical paintings, and engravings. Quantitative measures, namely Fréchet Inception Distance (FID), Inception Score (IS), and style consistency scores, further attest to the model’s ability to produce high-quality, diverse images. For instance, a style consistency score of 0.88 for engravings shows that the model captures the stylistic features of that category, while a low FID of 70.44 and a high IS of 7.3 suggest that the generated images are close to the original works.
In summary, this paper demonstrates that the enhanced pre-trained Stable Diffusion model has great potential to advance the state of the art in style-specific generated art.
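For readers unfamiliar with the Inception Score reported above, the following is a minimal sketch of its standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]). It is not the paper's evaluation code, which the abstract does not describe. The function name `inception_score` is hypothetical, and the sketch assumes the per-image class probabilities have already been obtained from a pre-trained classifier (in practice, an Inception network).

```python
import math

def inception_score(probs):
    """Compute the Inception Score from per-image class probabilities.

    probs: list of probability vectors p(y|x), one per generated image.
    (Assumed to come from a pre-trained classifier; not computed here.)
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Sum of KL(p(y|x) || p(y)) over images; zero-probability terms add 0.
    total_kl = 0.0
    for p in probs:
        total_kl += sum(
            pj * math.log(pj / mj) for pj, mj in zip(p, marginal) if pj > 0
        )
    # Exponentiate the mean KL divergence.
    return math.exp(total_kl / n)
```

Under this definition the score ranges from 1 (every image receives the same prediction) up to the number of classes (each image is classified with full confidence and the classes are evenly covered), which is why a higher IS is read as indicating both sharper and more diverse images.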
Xi Cao (Tue,) studied this question.