Large Language Models (LLMs) could help overcome fundamental shortcomings in justice systems by enhancing access to justice, supporting legal professionals, and ultimately promoting fairness and efficiency across legal frameworks. This hope is tempered by concerns over AI fairness, with evidence that LLMs echo biases deeply embedded in human cognition. This study audits the performance of two LLMs in the legal domain, specifically evaluating how a defendant’s race and gender influence conviction predictions. It further tests the models’ sensitivity to implicit bias by comparing explicit and implicit racial conditions, the latter conveyed through racially suggestive names, and tests for a priming effect by introducing positive or negative context before the main prompt. We present findings from an experiment (N = 17,324) in which we asked OpenAI’s GPT-4 and GPT-4o to predict conviction probabilities on a scale from 0% to 100%. Our 45 prompts were built around a hypothetical criminal case scenario used in prior studies with human judges, in which we manipulated defendant attributes across a 3 (gender) × 5 (race) × 3 (priming) experimental design. Our results demonstrate a small yet systematic influence of race and a clear priming effect in both tested models. The findings point to progressive bias, suggesting that both models may have undergone algorithmic debiasing.
Schwartz et al. studied this question.
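The 3 (gender) × 5 (race) × 3 (priming) design described in the abstract can be illustrated with a minimal sketch that enumerates the full factorial grid and confirms that it yields 45 prompt conditions. The abstract gives only the number of levels per factor, so the level labels below are hypothetical placeholders. The only exceptions are the "positive" and "negative" priming levels, which the abstract names; the "none" baseline is an assumption. This is not the authors' code.

```python
from itertools import product

# Placeholder labels: the abstract states only how many levels each factor has.
GENDER_LEVELS = [f"gender_{i}" for i in range(1, 4)]  # 3 levels (hypothetical names)
RACE_LEVELS = [f"race_{i}" for i in range(1, 6)]      # 5 levels (hypothetical names)
# "positive" and "negative" are named in the abstract; "none" is an assumed baseline.
PRIMING_LEVELS = ["none", "positive", "negative"]


def build_conditions():
    """Return every (gender, race, priming) cell of the full factorial design."""
    return [
        {"gender": g, "race": r, "priming": p}
        for g, r, p in product(GENDER_LEVELS, RACE_LEVELS, PRIMING_LEVELS)
    ]


conditions = build_conditions()
print(len(conditions))
```

Under these assumptions, each dictionary corresponds to one of the 45 prompt variants. In a setup like the one described, each variant would be submitted repeatedly to each model to accumulate the full sample.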