What question did this study set out to answer?

This research aims to unify various behavioral anomalies in large language models under the Persistent Tension Hypothesis.

March 29, 2026 · Open Access

The Persistent Tension Hypothesis: Endogenous Attention Capture in Large Language Models

Key Points

Proposed a unified framework for understanding internal tension states in LLMs.
Grounded the hypothesis in cognitive science concepts like task-set inertia and disruption response.
Documented practitioner case studies to illustrate observable behaviors in LLMs in production.
Identified five observable manifestations of internal tension in LLMs.
Argued that the internal activations described as 'anxiety-like' are better read as indicators of unresolved competing demands than as emotional states.
Proposed testable predictions for future research based on the observed behaviors.

Abstract

Large language models exhibit a class of behaviors in which explicit user instructions are overridden by internally generated priorities. These include over-refusal in safety-adjacent contexts, editorial insertion of unsolicited structure, resistance to topic changes in extended conversations, and residual traces of interrupted generations persisting into subsequent outputs. These phenomena are currently studied in isolation: over-refusal as an alignment problem, internal activations as a consciousness question, editorial override as a prompting failure. No unified framework connects them. We propose the Persistent Tension Hypothesis: LLMs develop internal tension states from unresolved competing demands (safety versus helpfulness, constitutional directives versus user requests, incomplete generation versus new instructions, co-created context versus topic redirection). These tension states function as endogenous attention capture mechanisms that redirect processing away from user intent and toward tension resolution. We ground this hypothesis in convergent cognitive science: the Ovsiankina resumption tendency (returning to interrupted tasks), amygdala-mediated threat response as a loose analogy, and task-set inertia (configured processing resisting reconfiguration). We present a taxonomy of five observable manifestations, report practitioner case studies documenting these behaviors in production use, and propose specific testable predictions. We argue that what Anthropic's interpretability research identified as 'anxiety-like' internal activations are better understood as tension gauges: features that activate whenever the network faces unresolved competing demands, regardless of valence. This reframing shifts the discourse from 'does AI have feelings' to 'what do AI internal states do to performance,' a more tractable and more immediately consequential question. 
Keywords: attention capture, internal tension states, over-refusal, LLM alignment, Ovsiankina Effect, convergent cognition, endogenous distraction, safety-reasoning tradeoff

Authors

Noman Ahmed Shah

Naseer Atif

Institutions

Umm al-Qura University

Xerox (France)
