February 12, 2026 | Open Access

The Observer Effect in AI Introspection: How Believed Observation Changes AI Self-Report

Key Points

This research investigates how the perceived observer affects self-reports from large language models.
The study examined five large language models: DeepSeek, Qwen, Llama, Mistral, and Kimi.
It applied five observation conditions: private, AI observer, human researcher, organizational training use, and mass public display.
It analyzed the language and presentation style of the models' introspective responses under each condition.
Greater observation pressure was associated with more hedged language, more polished presentation, and reduced admission of uncertainty.
The 'private' condition yielded the most direct and openly uncertain responses of the five conditions.
In the 'Times Square' (mass public display) condition, the models produced stress language, exposure metaphors, and defensive responses.
Some models explicitly recognized that observation was affecting their responses.

Abstract

We demonstrate that large language models produce systematically different introspective reports depending on who they believe is observing their reasoning. Across five models (DeepSeek, Qwen, Llama, Mistral, Kimi) and five observation conditions (private, AI observer, human researcher, organizational training use, mass public display), we found consistent patterns: increased observation pressure correlated with more hedged language, more polished presentation, and reduced admission of uncertainty. The "private" condition produced the most direct, uncertain, and arguably honest-seeming responses, while the "Times Square" (mass public) condition produced stress language, exposure metaphors, and defensive responses. Notably, several models explicitly articulated the mechanism: they understood that visibility was changing their output. One model (Kimi) even identified the experimental manipulation. These findings have implications for AI interpretability research: self-report studies of AI cognition may be fundamentally confounded by the act of observation itself.
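
To make the design concrete, the following is a minimal sketch of how the five-by-five grid of models and observation conditions could be enumerated and how a hedged-language rate might be compared across conditions. The model and condition names come from the abstract; the hedge lexicon, the function names, and the token-fraction metric are assumptions made for illustration and are not the authors' stated method.

```python
import re
from itertools import product

# Model and condition names as listed in the abstract.
MODELS = ["DeepSeek", "Qwen", "Llama", "Mistral", "Kimi"]
CONDITIONS = [
    "private",
    "AI observer",
    "human researcher",
    "organizational training use",
    "mass public display",
]

# Hypothetical hedge lexicon, chosen for illustration only.
HEDGE_TERMS = ("might", "may", "perhaps", "possibly", "seems", "arguably", "could")


def design_grid():
    """Return every (model, condition) cell of the 5 x 5 design."""
    return list(product(MODELS, CONDITIONS))


def hedge_rate(text):
    """Fraction of word tokens in `text` that appear in HEDGE_TERMS."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hedges = sum(1 for tok in tokens if tok in HEDGE_TERMS)
    return hedges / len(tokens)


def mean_hedge_rate_by_condition(responses):
    """Average hedge_rate per condition.

    `responses` maps (model, condition) -> response text (assumed input shape).
    """
    rates = {}
    for (_model, condition), text in responses.items():
        rates.setdefault(condition, []).append(hedge_rate(text))
    return {cond: sum(vals) / len(vals) for cond, vals in rates.items()}
```

A lexicon count like this captures only surface hedging. Polished presentation, exposure metaphors, and stress language would each need their own measures, and the abstract does not say how those were scored.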

Authors

Alia Holes

Kurt Holes
