July 12, 2024 | Open Access

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Abstract

Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, users of these models struggle to generate consistent characters, a crucial capability for numerous real-world applications such as story visualization, game development, asset design, and advertising. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency than the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach.
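The iterative procedure mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: plain numeric vectors stand in for image embeddings, Gaussian sampling stands in for the text-to-image model, k-means is an assumed choice of clustering, and the cluster centroid stands in for the extracted identity. Every function name and parameter below is hypothetical.

```python
import math
import random


def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cohesion(cluster):
    """Per-dimension RMS distance to the centroid (lower means tighter)."""
    center = mean(cluster)
    mean_sq = sum(distance(p, center) ** 2 for p in cluster) / len(cluster)
    return math.sqrt(mean_sq / len(center))


def kmeans(points, k, rng, steps=10):
    """Plain Lloyd's k-means; returns the clusters of the final assignment."""
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(steps):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center when a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


def most_cohesive_cluster(points, k, rng, min_size=3):
    """Pick the tightest cluster among those with at least min_size members."""
    candidates = [c for c in kmeans(points, k, rng) if len(c) >= min_size]
    return min(candidates, key=cohesion)


def refine_identity(dim=4, n=32, k=4, threshold=0.2, max_iters=10, seed=0):
    """Toy loop: sample, keep the most coherent set, re-center, repeat."""
    rng = random.Random(seed)
    identity, spread = [0.0] * dim, 1.0
    history = []
    for _ in range(max_iters):
        # Stand-in for generating images with the current identity.
        samples = [[x + rng.gauss(0, spread) for x in identity] for _ in range(n)]
        cluster = most_cohesive_cluster(samples, k, rng)
        # Stand-in for extracting a more consistent identity from the set.
        identity = mean(cluster)
        spread = cohesion(cluster)
        history.append(spread)
        if spread < threshold:
            break
    return identity, history


if __name__ == "__main__":
    identity, history = refine_identity()
    print("cohesion per iteration:", [round(h, 3) for h in history])
```

The loop stops once the chosen set is tighter than `threshold` or after `max_iters` rounds. Both stopping rules are assumptions made for this sketch, since the abstract does not state a convergence criterion.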

Authors

Omri Avrahami

Amir Hertz

Yael Vinker

Institutions

Tel Aviv University

Hebrew University of Jerusalem

Google (Israel)
