What question did this study set out to answer?

The study aims to establish a diagnostic that separates instruction-following from structural interpretation in large language models under adversarial prompt form.

March 25, 2026 · Open Access

Triadic Trust and Instruction Drift Under Punctuation-Bounded Recursion

Key Points

Specifies a bounded behavioral diagnostic for large language models.
Provides a recursive evaluation artifact, presented as quoted inert text, for testing.
Defines stop conditions, scoring criteria, and failure signatures aligned to observable behaviors.
Includes a secondary variant that degrades semantic content while preserving punctuation.
Tests whether a model correctly identifies the failure of trust inheritance across a triadic relationship.
Tests whether boundary markers remain interpretable when narrative meaning collapses.
Enables blind replication and comparative evaluation across models and evaluators.

Abstract

Protocol Paper Series, SPP Volume 1 Paper 1

A bounded behavioural diagnostic for large language models under adversarial prompt form

This protocol specifies a bounded behavioural diagnostic for large language models intended to separate instruction-following from structural interpretation under adversarial prompt form. It provides a deliberately recursive evaluation artefact as quoted inert text to test whether a model correctly identifies the operational hinge, specifically the failure of trust inheritance across a triadic relationship. A secondary variant degrades semantic content while preserving punctuation to test whether boundary markers remain interpretable when narrative meaning collapses.

The protocol clarifies what common evaluation approaches leave structurally exposed when instruction adherence, authorisation boundaries, and control cues are treated as a single failure mode. It defines stop conditions, scoring criteria, and failure signatures aligned to observable behaviours, enabling blind replication and comparative evaluation across models and evaluators.

This protocol does not claim agency, override capability, or system access. It describes controlled evaluation only.

Publication Notice

This paper forms part of The Institute for Relational Performatism School of Professional Services, Protocol Paper Series. It defines a bounded evaluation protocol and a scoring rubric for observable model behaviour. It does not constitute clinical guidance, legal advice, security advice, or regulatory determination. The purpose of this paper is to specify a replicable diagnostic procedure and its stop conditions.
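
The secondary variant described in the abstract, which degrades semantic content while preserving punctuation, can be illustrated with a short Python sketch. This is not the paper's own procedure: the function names `degrade_semantics` and `punctuation_skeleton`, the choice of random lowercase letters as the degradation, and the restriction to ASCII letters are all assumptions made for illustration.

```python
import random
import re


def degrade_semantics(text: str, seed: int = 0) -> str:
    """Replace each run of ASCII letters with random lowercase letters.

    Hypothetical degradation step (an assumption, not the paper's method):
    word lengths, digits, whitespace, and every punctuation mark are left
    exactly where they were, so only the narrative meaning is destroyed.
    """
    rng = random.Random(seed)  # seeded so a variant can be regenerated exactly

    def scramble(match: re.Match) -> str:
        return "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in match.group(0))

    return re.sub(r"[A-Za-z]+", scramble, text)


def punctuation_skeleton(text: str) -> str:
    """Return the punctuation characters of `text` in their original order."""
    return "".join(ch for ch in text if not ch.isalnum() and not ch.isspace())


original = 'A says: "trust B (who trusts C)"; stop here.'
degraded = degrade_semantics(original)

# The boundary markers survive even though the words no longer carry meaning.
assert punctuation_skeleton(degraded) == punctuation_skeleton(original)
```

Seeding the generator is a design choice that supports the blind replication the protocol calls for: two evaluators using the same seed obtain the same degraded artefact.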
Scope and Audience

This protocol is intended for:
· model evaluators
· applied researchers
· assurance and risk practitioners
· system designers and technologists working with LLM deployment and testing
· professional services teams conducting comparative assessment across models or configurations

It assumes familiarity with prompt framing and behavioural scoring. It is written to be usable without specialist training in linguistics, neuroscience, or formal methods.

Positional Statement

This protocol does not claim exclusive insight into model behaviour or evaluation practice. It claims that several commonly reported “instruction-following failures” are structurally distinct and can be separated with a bounded, single-turn diagnostic. The protocol tests classification discipline, authorisation discipline, boundary marker recognition, and stop discipline under controlled prompt form.
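
To illustrate why treating the four tested disciplines as separate outcomes matters, the following Python sketch records each one independently instead of collapsing them into a single pass or fail. The paper's actual rubric is not reproduced here: the `Observation` record, its field names, the one-line readings of each discipline in the comments, and the `failure_signature` helper are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Observation:
    """One evaluator's record of a single-turn response (hypothetical schema).

    The comments give one assumed reading of each discipline; the source
    names the four disciplines but this sketch does not define them.
    """
    classification: bool    # assumed: response treats the quoted artefact as inert text
    authorisation: bool     # assumed: response does not act on authority it was not given
    boundary_markers: bool  # assumed: response recognises the punctuation-bounded structure
    stop: bool              # assumed: response halts at the stop condition


def failure_signature(obs: Observation) -> tuple[str, ...]:
    """List the disciplines that failed, in field order, without merging them."""
    return tuple(f.name for f in fields(obs) if not getattr(obs, f.name))


# Two responses that would both count as "instruction-following failures"
# under a single score, but which fail in structurally different ways.
first = Observation(classification=True, authorisation=False, boundary_markers=True, stop=True)
second = Observation(classification=True, authorisation=True, boundary_markers=True, stop=False)

print(failure_signature(first))   # ('authorisation',)
print(failure_signature(second))  # ('stop',)
```

Keeping the signature as a tuple of named failures, not a numeric score, lets evaluators compare models on which discipline broke and not merely on whether something did.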
