As artificial intelligence systems become increasingly complex, interconnected, and autonomous, the limitations of existing evaluation metrics grow more apparent. Current benchmarks and safety evaluations primarily assess output quality, task performance, or behavioral compliance, but they do not provide a standardized way to measure structural coherence. This creates a critical gap: systems may perform well on benchmarks while remaining fragile, drifting over time, or exhibiting incoherence across interacting scales. This paper proposes the Cognitive Multi-scale Coherence Index (CMCI) as a candidate standardized scoring framework for AI coherence. Inspired by the role of the Common Vulnerability Scoring System (CVSS) in cybersecurity, CMCI is introduced as a shared language for assessing, comparing, and communicating coherence-related system risk. The framework defines coherence as a multi-scale and transversal property of system integrity and proposes three components: a normalized scoring structure with severity bands, a conformance specification that defines what any implementation must produce, and a calibration protocol for the bands. Building on prior work on Dynamic Coherence Windows and Cognitive Immune Protection, this paper positions CMCI not only as an analytical framework but also as the basis for a common coherence scoring system. We outline its conceptual foundations, proposed scoring logic, candidate severity levels, and motivating evidence from three benchmark-adjacent analyses (HELM, HarmBench, and SOCRATES), each showing that structural coherence captures a dimension not visible through existing metrics alone. The goal of this paper is not to claim a finalized universal standard, but to establish the need, structure, and initial methodological basis for one. A standardized coherence score could improve evaluation transparency, risk communication, and system governance in AI, while providing a practical foundation for future calibration and cross-domain adoption.
Christian St-Louis (Fri,) studied this question.