Abstract
Advances in generative AI have enabled visual content creation through text-to-image (T2I) generation. Despite their creative potential, T2I models often replicate and amplify societal stereotypes related to gender, race, and culture. This paper introduces a theory-driven bias detection rubric and a Social Stereotype Index (SSI) to systematically evaluate bias in T2I outputs. We audited the outputs of three major T2I models (DALL-E-3, Midjourney-6.1, and Stability AI Core) using 100 queries across geocultural, occupational, and adjectival categories. The results show recurring stereotypes, including gendered professions, cultural markers, and Western beauty norms. Guided by our rubric, we applied prompt refinement, which reduced SSI scores by 58% (geocultural), 66% (occupational), and 53% (adjectival). A complementary user study revealed tensions: while refinement mitigates bias, it may weaken contextual alignment, and participants often viewed stereotypical imagery as more “expected.” We call for T2I systems to balance ethical debiasing with contextual relevance, supporting inclusivity without oversimplifying social realities.
Barve et al. (Tue,) studied this question.