Large language models (LLMs) have emerged as powerful tools for generating human-like text, transforming human-machine interactions. However, their widespread adoption has raised concerns about their potential to influence public opinion and shape political narratives. In this study, we investigate geopolitical bias in GPT-4o (OpenAI) and DeepSeek-R1 (DeepSeek), focusing on how these models respond to questions concerning international affairs and global conflicts. We designed a set of 50 geopolitical questions informed by prevalent themes and frequently asked queries observed in online discourse, media coverage, and public forums. Through qualitative and quantitative analysis of the models’ responses, we found that GPT-4o generally exhibited soft, Western-centric biases in framing and emphasis, while DeepSeek-R1 showed more explicit, nationalistic biases aligned with Chinese state perspectives. Despite these biases, for a subset of questions the models’ responses were more closely aligned than expected, indicating that they can address sensitive topics without necessarily presenting directly opposing viewpoints. Our findings contribute to the emerging literature on LLM behavior by revealing how divergent political orientations in AI systems can affect the consistency and perceived neutrality of their outputs. Such disparities raise important concerns about trustworthiness, interpretability, and the potential for manipulation in politically sensitive domains. These results highlight the importance of critically evaluating AI-generated content, especially as LLMs become increasingly embedded in public discourse and decision-making processes.
Pacheco et al. studied this question.