What question did this study set out to answer?

The study aims to investigate the differences in visual creativity between human artists and AI-generated images.

March 26, 2026 · Open Access

Stable Diffusion Models Reveal a Persisting Human–AI Gap in Visual Creativity

Key Points

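The study compared image generation by Visual Artists, Non-Artists, and a generative AI model under two prompting conditions: Human-Inspired (high human input) and Self-Guided (low human input).
Human raters (N = 255) and an AI evaluator (GPT-4o) rated the creativity of the images; GPT-4o was tested both as a baseline mirroring the human rating task and with in-context guidance from human-rated examples.
Creativity followed a gradient across groups that tracked the level of human input.
Visual Artists scored highest, followed by Non-Artists, with Human-Inspired GenAI at or just below that level and Self-Guided GenAI lowest.
Increased human guidance strongly improved GenAI's creative output, bringing it close to that of Non-Artists.
Guided-GPT-4o more closely approximated human judgment patterns, whereas baseline GPT-4o discriminated less between image categories and gave inflated scores to GenAI outputs.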

Abstract

While recent research suggests Large Language Models match human creative performance in divergent thinking tasks, visual creativity remains underexplored. This study compared image generation by human participants (Visual Artists and Non-Artists) and by an image-generation AI model under two prompting conditions with varying human input (high for Human-Inspired, low for Self-Guided). The creativity of the resulting images was evaluated by human raters (N = 255) and by GPT-4o acting as an AI rater under two conditions: strictly mirroring the human rating task, and using in-context learning with human-rated examples as guidance. We observed a clear creativity gradient: Visual Artists > Non-Artists ≥ Human-Inspired GenAI > Self-Guided GenAI. Increased human guidance strongly improved GenAI's creative output, bringing its productions close to those of Non-Artists. Moreover, while Guided-GPT-4o more closely approximated human creativity judgment patterns, baseline GPT-4o (without guidance) exhibited markedly different creativity evaluations, showing reduced discrimination between image categories and inflated scores for GenAI outputs. These results suggest that, in contrast to language-centered tasks, GenAI models may face unique challenges in visual domains, where creativity depends on perceptual nuance and contextual sensitivity, distinctly human capacities that may not be readily transferable from language models.

Cite This Study

Rondini et al. (Tue,) studied this question.

synapsesocial.com/papers/69c4cd73fdc3bde448919d52 https://doi.org/10.1002/advs.202524142
