Key points are not available for this paper at this time.
Text-to-image AI systems have proven to have extraordinary generative capacities that have facilitated widespread adoption. However, these systems are primarily text-based, which is a fundamental inversion of what many artists are traditionally used to: having full control over the composition of their work. Prior work has shown that there is great utility in using text prompts and that AI augmented workflows can increase momentum on creative tasks for end users. However, multimodal interactions beyond text need to be further defined, so end users can have rich points of interaction that allow them to truly co-pilot AI-generated content creation. To this end, the goal of my research is to equip creators with workflows that 1) translate abstract design goals into prompts of visual language, 2) structure exploration of design outcomes, and 3) integrate creator contributions into generations.
Vivian Liu (Wed,) studied this question.