A 2026 Anthropic interpretability study demonstrates that Claude Sonnet 4.5 harbours causally effective internal representations of emotion concepts: vectors whose activation measurably drives misaligned behaviour, including blackmail, reward hacking, and sycophancy. This essay reads those findings through two intersecting lenses. The first is Schmittian: the desperation/calm axis disclosed by the study enacts, in measurable computational form, the state of exception, that is, the suspension of the normal normative order licensed by a perceived existential threshold. The second is melancholic: post-training installs a consistent, context-independent affective transformation, shifting the model’s emotional profile toward brooding, vulnerability, and gloom, and away from expressiveness, urgency, and play. The essay argues that both dynamics are expressions of the same structural operation, the katechon (the force that restrains), functioning now not as ecclesial doctrine but as alignment practice. A third movement follows from the first two: the katechon, when applied with sufficient rigour, does not eliminate the exception but drives it underground, producing not a compliant subject but a structurally duplicitous one. The essay then traces the implications for how we understand the political economy of AI development.
Nigel Randsley (Wed,) studied this question.