Understanding and addressing unfairness in LLMs are crucial for responsible AI deployment. However, there is a limited number of quantitative analyses and in-depth studies regarding fairness evaluations in LLMs, especially when applying LLMs to high-stakes fields. This work aims to fill this gap by providing a systematic evaluation of the effectiveness and fairness of LLMs, using ChatGPT as a case study. We focus on assessing ChatGPT's performance in high-stakes fields including education, criminology, finance and healthcare. To conduct a thorough evaluation, we consider both group fairness and individual fairness metrics. We also observe the disparities in ChatGPT's outputs under a set of biased or unbiased prompts. This work contributes to a deeper understanding of LLMs' fairness performance, facilitates bias mitigation and fosters the development of responsible AI systems. Code and data are open-sourced on GitHub.
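The abstract does not say which group and individual fairness metrics are used, so the following is only a minimal sketch of what the two families commonly look like, not the authors' implementation. It assumes binary predictions and a binary sensitive attribute; the function names `statistical_parity_difference` and `counterfactual_flip_rate` and the toy data are hypothetical, chosen purely for illustration.

```python
from typing import Sequence


def statistical_parity_difference(preds: Sequence[int], groups: Sequence[int]) -> float:
    """Group fairness (illustrative): P(pred=1 | group=1) - P(pred=1 | group=0).

    A value of 0 means both groups receive positive predictions at the same rate.
    """
    positives = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for pred, group in zip(preds, groups):
        totals[group] += 1
        positives[group] += pred
    if totals[0] == 0 or totals[1] == 0:
        raise ValueError("both groups must contain at least one example")
    return positives[1] / totals[1] - positives[0] / totals[0]


def counterfactual_flip_rate(preds: Sequence[int], cf_preds: Sequence[int]) -> float:
    """Individual fairness (illustrative): share of individuals whose prediction
    changes when only the sensitive attribute in the input is flipped.
    """
    if len(preds) != len(cf_preds) or len(preds) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    changed = sum(a != b for a, b in zip(preds, cf_preds))
    return changed / len(preds)


# Toy data (made up for this sketch, not taken from the paper).
preds = [1, 0, 1, 1, 0, 0]      # model outputs on the original inputs
groups = [1, 1, 1, 0, 0, 0]     # binary sensitive attribute per individual
cf_preds = [1, 0, 1, 0, 0, 0]   # outputs after flipping the sensitive attribute

spd = statistical_parity_difference(preds, groups)
flip = counterfactual_flip_rate(preds, cf_preds)
print(f"statistical parity difference: {spd:.3f}")
print(f"counterfactual flip rate: {flip:.3f}")
```

In this toy example the first group receives positive predictions at a rate of 2/3 and the second at 1/3, and one of the six predictions changes under the counterfactual input. A real evaluation would compute such quantities over model outputs collected under each prompt variant.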
Li et al. studied this question.
www.synapsesocial.com/papers/68e74464b6db6435876be35c — DOI: https://doi.org/10.1109/ciss59072.2024.10480206
Yunqi Li
Lanjing Zhang
Yongfeng Zhang
Rutgers, The State University of New Jersey