Existing methods for evaluating AI behaviour conflate personality measurement with specification compliance. This paper presents the Specification Profiling Framework (SPF), a specification-verification method that produces machine-readable evidence of whether an AI system's observable output conforms to an explicit behavioural specification. SPF evaluates systems across eight behavioural constraints using a two-turn protocol that isolates specification effects from baseline behaviour. Methodology validation with four commercial AI systems reveals substantial per-system variation: compliance ranges from 0/8 to 6/8 constraints. A specification reversal anomaly (D8 DomainStrictness) demonstrates that assessing each dimension separately exposes structural failures that a single scalar score would hide. All evidence artefacts are structured (JSON), reproducible, and mapped to EU AI Act conformity assessment requirements (Annex A).
Kafkas M. Caprazli (Thu,) studied this question.
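To make the idea of a structured, per-constraint evidence artefact concrete, the following is a minimal sketch and not the paper's actual schema. It assumes that the first turn records baseline behaviour without the specification and the second turn records behaviour with it, and that a "reversal" means a constraint was met at baseline but violated once the specification was supplied. The class `ConstraintResult`, the function `build_evidence`, the field names, the `placeholder_*` constraint names, and all values are hypothetical; only the D8 DomainStrictness label comes from the abstract.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ConstraintResult:
    constraint_id: str        # e.g. "D8"
    name: str                 # e.g. "DomainStrictness"
    baseline_conforms: bool   # turn 1: no specification supplied (assumption)
    specified_conforms: bool  # turn 2: specification supplied (assumption)

    @property
    def status(self) -> str:
        # Hypothetical three-way labelling; the paper may define this differently.
        if self.specified_conforms:
            return "compliant"
        if self.baseline_conforms:
            return "reversal"  # met at baseline, violated under the specification
        return "non_compliant"


def build_evidence(system_id: str, results: list[ConstraintResult]) -> dict:
    """Assemble a JSON-serialisable record with per-constraint detail."""
    compliant = sum(r.specified_conforms for r in results)
    return {
        "system": system_id,
        "compliance": f"{compliant}/{len(results)}",
        "constraints": [{**asdict(r), "status": r.status} for r in results],
    }


# Toy values for illustration only; these are not results from the paper.
toy_turns = [(False, True)] * 5 + [(False, False)] * 2
results = [
    ConstraintResult(f"D{i}", f"placeholder_{i}", base, spec)
    for i, (base, spec) in enumerate(toy_turns, start=1)
]
results.append(ConstraintResult("D8", "DomainStrictness", True, False))

evidence = build_evidence("toy-system", results)
print(json.dumps(evidence, indent=2))
```

In this sketch the scalar summary `compliance` counts D8 as one more failed constraint, while the per-constraint `status` field separates a reversal from an ordinary non-compliance. That is the kind of distinction the abstract attributes to separated, multi-dimensional assessment.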