What question did this study set out to answer?

The research aims to examine social biases in LLM outputs related to patient attributes and the influence of design choices on these biases.

April 6, 2026Open Access

Bias Patterns in the Application of LLMs for Clinical Decision Support

Key Points

The research aims to examine social biases in LLM outputs related to patient attributes and the influence of design choices on these biases.
Evaluated eight popular LLMs in clinical settings
Used standardized question-answering datasets with clinical vignettes
Employed red-teaming strategies to analyze bias based on demographics
Compared prompting techniques including Zero-shot and Chain of Thought
Identified various disparities across protected groups
Found that larger models were not necessarily less biased
Noted that medical fine-tuning didn't consistently outperform general-purpose models
Demonstrated that specific prompt phrasing significantly influenced bias patterns
Showed that Chain of Thought approaches effectively reduced bias

Abstract

Objectives.To investigate the extent to which Large Language Models (LLMs) exhibit social bias based on protected patient attributes and to determine how design choices, such as architecture and prompting strategies, influence these observed biases in clinical decision support.Methods.We evaluated eight popular LLMs, including general-purpose and clinically trained models, across three standardized question-answering datasets using clinical vignettes.We employed red-teaming strategies to analyze the impact of demographics on LLM outputs and compared various prompting techniques, including Zero-shot and Chain of Thought.Results.Our experiments reveal various disparities across protected groups.Notably, larger models were not necessarily less biased, and medical fine-tuning did not consistently outperform general-purpose models.Furthermore, specific prompt phrasing significantly influenced bias patterns, whereas reflection-type approaches like Chain of Thought effectively reduced biased outcomes.Conclusions.LLMs demonstrate significant social biases in clinical scenarios that are influenced by model architecture and prompt engineering.These findings highlight the critical need for rigorous evaluation and enhancement of LLMs before their integration into clinical decision support systems.Consistent with prior studies, we call for additional scrutiny to ensure equity in AI-driven healthcare applications.All code and data are available at https://github.com/healthylaife/FairCDSLLM.Doi: 10.

Bookmark

View Full Paper

Bookmark

View Full Paper

Bias Patterns in the Application of LLMs for Clinical Decision Support

Key Points

Abstract

Cite This Study