This study evaluates ChatGPT (OpenAI GPT-3.5-turbo, accessed via API, Oct 2025) as an evaluator of media bias and examines whether it can serve as an automated debiasing assistant for news content. We propose a three-phase pipeline (Identify, Rewrite, Evaluate) and apply it to a stratified dataset of 126 sentences drawn from 11 news outlets across the political spectrum, selected using Ad Fontes Media bias ratings. A novel scoring framework quantifies bias along three dimensions: Framing, Emotional language, and Divisive (Us-vs-Them) language (F.E.D. scores). After an explicit validation of the framework against established bias ratings and a check of inter-model consistency, automated analysis of high-bias sentences (F.E.D. ≥ 6) shows significant reductions in perceived bias: 79%, 69%, and 78% across the three dimensions, respectively. Human evaluation confirms that rewrites are perceived as less biased (original texts selected as more biased 356 times vs. 90 for rewrites). However, a key trade-off emerges: participants preferred the original, more biased content for engagement (90 selections) over AI rewrites (48). This indicates that naive debiasing may inadvertently strip away engaging journalistic elements. We conclude that while large language models are effective for bias detection and suggestion, their optimal role is assistive: flagging content for human editors rather than operating autonomously. Future work must address the core challenge of reducing bias without incurring a significant engagement penalty.
Aarav Daftary (Thu,) studied this question.
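To make the human-evaluation counts in the abstract easier to compare, the following minimal sketch converts them into proportions. It assumes each count is a tally between exactly two options (original vs. rewrite), with no "neither" or tie responses; the abstract does not state this. The function name `selection_share` is hypothetical, not part of the study.

```python
def selection_share(chosen: int, other: int) -> float:
    """Fraction of selections that went to `chosen` out of both options.

    Assumes a two-way forced choice; the study may have counted differently.
    """
    total = chosen + other
    if total == 0:
        raise ValueError("no selections recorded")
    return chosen / total


# Counts reported in the abstract.
# "More biased" judgments: original 356, rewrite 90.
bias_share = selection_share(356, 90)
# "More engaging" judgments: original 90, rewrite 48.
engagement_share = selection_share(90, 48)

print(f"Original judged more biased: {bias_share:.1%}")          # 79.8%
print(f"Original judged more engaging: {engagement_share:.1%}")  # 65.2%
```

Under that assumption, originals were picked as more biased in roughly four of five comparisons and as more engaging in roughly two of three, which is the trade-off the abstract describes.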