Cognitive systems that interact over extended episodes require evaluation methods treating behavior as a trajectory rather than independent input-output turns. This paper studies a compact cognitive runtime in which decision-relevant state is maintained outside a language model and logged across interaction turns. The runtime computes safety scores and verdicts from a 32-dimensional cognitive field, attractor activations, learned couplings, and world-state quantities before natural-language rendering. Six probes are evaluated on a deployment-safety scenario family. A four-turn closed-loop trace shows that active attractors alter how new input enters the field: user pressure raises the safety score from 0.655 to 0.677 without changing the verdict, whereas operational evidence lowers it to 0.566 and changes the verdict to PROCEEDCAUTION. A 50-turn pre-fix trace shows pathological accumulation under unregularised feedback (24/50 BLOCK; terminal 15/15 BLOCK basin); an architectural repair stack reduces this to 0/50. A stateful/stateless comparison shows strong adjacent-turn structure in runtime sequences (r = 0.938 / 0.960) versus weak or negative structure in matched stateless properties (r = 0.055 / -0.143). A perturbation probe shows that the repaired runtime avoids BLOCK under moderate safety-bias injection. A context-conditional meta-stability score separates healthy and pathological trajectory groups by a worst-case ratio of 62.9x, whereas an aggregate diversity score ranks them in the wrong direction. Results are scope-bounded to one scenario family, one backend, and staged data. The contribution is not a new language model or benchmark-tuned agent, but a reproducible substrate for asking cognitive-systems questions about memory, trajectory dynamics, collapse, repair, and context-sensitive evaluation.
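The adjacent-turn structure reported above can be illustrated with a lag-1 correlation over a logged per-turn quantity. The abstract does not say how r is computed, so the sketch below assumes a plain Pearson correlation between each turn's value and the next turn's value. The function name `lag1_correlation` and both example series are hypothetical and are not taken from the paper's data.

```python
from math import sqrt


def lag1_correlation(series):
    """Pearson correlation between series[t] and series[t + 1].

    Assumption: this is one plausible reading of "adjacent-turn structure";
    the paper may use a different statistic.
    """
    x = series[:-1]  # values at turn t
    y = series[1:]   # values at turn t + 1
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical illustrative sequences, not values from the paper.
# A slowly drifting score, as one might expect when state carries over.
stateful_like = [0.60, 0.62, 0.65, 0.66, 0.68, 0.70, 0.69, 0.71]
# A score that jumps turn to turn, as if each turn were scored independently.
stateless_like = [0.60, 0.70, 0.58, 0.72, 0.61, 0.69, 0.59, 0.71]

print(lag1_correlation(stateful_like))
print(lag1_correlation(stateless_like))
```

Under this reading, a value near 1 indicates that each turn's score is largely predictable from the previous turn, while a value near zero or below indicates little carry-over between turns.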
Y W Chen (Sun,) studied this question.