Abstract Generative adversarial networks (GANs) and Stable Diffusion are two unsupervised Deep Learning techniques used to discern the underlying structure of multimodal imaging data. Training GANs alongside Stable Diffusion is challenging because of two main issues: mode collapse and non-convergence. A workable way to tackle these two problems with GANs and Stable Diffusion is to rework the network architecture to obtain a more powerful model. In this project, GAN-based and Stable Diffusion-based systems are designed and implemented for image generation and analysis. Several GAN and Stable Diffusion frameworks are used to interpret the images, including stable-diffusion-2-base, stable-diffusion-2-1, GAN, and unsupervised text-to-image translation models. The performance analysis is carried out by incorporating an additional hybrid architecture. CLIP scores and Real/Fake scores are used as metrics to evaluate Stable Diffusion and GAN performance, respectively. The proposed GAN model achieved accuracies of 81% and 91% on the FFHQ and Flicker datasets, respectively. The proposed Stable Diffusion model achieved a CLIP score of 31.98 on the Flicker dataset. The Stable Diffusion models outperform GANs by generating more realistic images. Such systems are useful for generating images of a criminal from the few images available, building image datasets, medical image generation, art generation, data augmentation, forensics and simulations, text-to-image translation, super resolution, etc.
Wagh et al. (Thu,) studied this question.
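The abstract reports a CLIP score but does not say how it was computed. As a minimal sketch, assuming the common convention in which the score is the cosine similarity between an image embedding and its prompt embedding, clamped at zero and multiplied by 100, the metric could be computed from precomputed embeddings as shown here. The function names, the scale factor of 100, and the use of precomputed embeddings are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def clip_score(image_emb, text_emb, scale=100.0):
    """Per-pair CLIP-style score: scale * max(cosine similarity, 0).

    image_emb, text_emb: arrays of shape (d,) or (n, d) holding embeddings
    that a CLIP image encoder and text encoder would produce (assumed to be
    computed elsewhere). scale=100.0 is an assumed convention.
    """
    image_emb = np.asarray(image_emb, dtype=np.float64)
    text_emb = np.asarray(text_emb, dtype=np.float64)
    # L2-normalise so the dot product equals the cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=-1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)
    cos = np.sum(image_emb * text_emb, axis=-1)
    # Negative similarities are clamped to zero.
    return scale * np.maximum(cos, 0.0)


def mean_clip_score(image_embs, text_embs, scale=100.0):
    """Average the per-pair score over a set of (image, prompt) pairs."""
    return float(np.mean(clip_score(image_embs, text_embs, scale)))
```

Under this convention a dataset-level figure such as the one reported would be the mean over all generated image and prompt pairs; a different scale factor or embedding model would give numbers that are not directly comparable.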