Advances in text-to-image generative models have made it easier for people to create art simply by prompting models with text. However, creating through text leaves users with limited control over the final composition or the way the subject is represented. A potential solution is to use image prompts alongside text prompts to condition the model. To better understand how and when image prompts can improve subject representation in generations, we conduct an annotation experiment to quantify their effect on generations of abstract, concrete plural, and concrete singular subjects. We find that initial images improved subject representation across all subject types, with the most noticeable improvement in concrete singular subjects. In an analysis of different types of initial images, we find that icons and photos produced high-quality generations with different aesthetics. We conclude with design guidelines for how initial images can improve subject representation in AI art.
Qiao et al. studied this question.