Drawing on the chivalry hypothesis and attribution theory, this study examines whether large language models (LLMs) exhibit gender-based differentiation in sentencing recommendations and stigmatizing evaluations across intimate partner violence, robbery, and financial fraud. Using an experimental vignette design, six AI models evaluated gender-manipulated criminal scenarios, providing sentencing recommendations and stigmatization ratings. Each scenario was prompted repeatedly to examine both within-model consistency and between-model variation, and the results revealed offense-specific patterns. Within models, significant gender disparities emerged in intimate partner violence scenarios, where female perpetrators received more lenient sentencing recommendations and lower stigmatization ratings, whereas such differences were weak or absent in robbery and financial fraud. Within-model analyses also revealed that the strength and consistency of the association between stigmatizing evaluations and sentencing severity varied across AI systems. Between-model analyses revealed substantial heterogeneity across offense types, with AI systems differing not only in their gender-based sentencing recommendations but also in overall levels of stigmatizing evaluations in response to gender manipulations. Overall, the results suggest that AI systems neither uniformly replicate nor fully transcend human gender biases, underscoring the need for cautious deployment of AI tools in legal contexts.
Lam et al. studied this question.