Abstract

Early large language models (LLMs) were released with minimal alignment, offering a rare view of how generative systems reframed the ethical values embedded in human texts. We examine outputs from a 2021 version of OpenAI’s base GPT-3, prompting it to summarise culturally diverse source materials (laws, political speeches, and philosophical works) and interpreting results through a descriptive, moral value pluralist lens. Where possible, we contextualise outputs with cross-national datasets such as the World Values Survey. We document recurring value drift: Australia’s firearm policy is recast as a threat to liberty; de Beauvoir’s feminist critique becomes gender-essentialist dating advice; and Merkel’s humanitarian appeal is reframed as immigration control. In contrast, multilateral documents (UN/UNESCO) exhibit greater value stability, suggesting consensus-crafted language can buffer against cultural mutation. We argue that these early behaviours (observed before extensive fine-tuning and safety layers) provide a historically important baseline for understanding how training distributions shape normative framing. Our contribution is twofold: (1) empirical evidence that value drift can invert or overwrite embedded values along predictable cultural axes, and (2) a pluralist, descriptive evaluation method that surfaces whose values dominate and when. We conclude with implications for culturally inclusive evaluation and alignment in contemporary LLMs.
Rebecca L. Johnson
Giada Pistilli
Natalia Menéndez
The University of Sydney
Delft University of Technology
Università Cattolica del Sacro Cuore
Johnson et al. studied this question.
www.synapsesocial.com/papers/68c1872d9b7b07f3a0611838 — DOI: https://doi.org/10.21203/rs.3.rs-7503184/v1