May 16, 2026 | Open Access

The Stationary Sea (Part 2: The Long and Winding Road)

Key Points

The research aims to empirically validate substrate identification reproducibility using a structured framework across institutions.
A pre-registered variance attribution pilot was fired on 11 May 2026 across four institutions and 13 cycles.
Three substrate-level instruments were analysed for the four G-SIBs: bridge rate, fragmentation collapse, and cumulative cross-cycle coverage.
The pre-registered raw agent-name Jaccard prediction of 0.90 was falsified at an observed mean of 0.162, and the paper articulates two architectural commitments.
Mean bridge rates were 69.97% at the UK G-SIB, 56.53% at North American G-SIB B, 54.69% at European G-SIB D, and 48.77% at European G-SIB C.
The fragmentation factor collapsed to the 1.331 to 1.744 band across all 13 cycles.
Cross-institutional heterogeneity was observed: North American G-SIB B rose 25.12% against the April baseline, while the other three institutions were flat or down.
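
The raw agent-name Jaccard measure named in the key points is the standard set-overlap ratio. The sketch below is illustrative only: the agent names are hypothetical, the function names are ours, and applying the halt rule to the mean over pairs is an assumption (the abstract states only that a halt at Jaccard below 0.30 was pre-registered and honoured).

```python
from statistics import mean

HALT_THRESHOLD = 0.30  # pre-registered halt level quoted in the abstract


def jaccard(a: set[str], b: set[str]) -> float:
    """Return |A intersect B| / |A union B| for two sets of agent names."""
    union = a | b
    if not union:
        return 1.0  # assumption: two empty sets are treated as identical
    return len(a & b) / len(union)


def should_halt(pair_scores: list[float], threshold: float = HALT_THRESHOLD) -> bool:
    """Assumed halt rule: stop when the mean pairwise Jaccard falls below the threshold."""
    return mean(pair_scores) < threshold


# Hypothetical agent names from two scan cycles of one institution.
cycle_1 = {"fraud triage agent", "kyc review agent", "fx pricing agent"}
cycle_2 = {"fraud triage agent", "kyc screening agent", "fx pricing agent", "claims agent"}

score = jaccard(cycle_1, cycle_2)  # 2 shared names out of 5 distinct names
print(f"Jaccard = {score:.3f}, halt = {should_halt([score])}")
```

Under this rule, the reported observed mean of 0.162 would trigger the halt, whereas the predicted 0.90 would not.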

Abstract

This paper reports the four-institution empirical validation of substrate identification reproducibility for the Meridian Autonomy substrate documented in the companion publication (Collins, 2026e). The validation is built on the two-lane canonicalisation framework and the four-component variance decomposition introduced in Annex 1a v1.0 (Collins, 2026i). Three threads of evidence are reported, two architectural commitments are articulated, and one forward-looking instrument is proposed.

The empirical thread is a pre-registered variance attribution pilot fired 11 May 2026 across four institutions and 13 cycles, in which the pre-registered raw agent-name Jaccard prediction of 0.90 was falsified at an observed mean of 0.162 on the six pinned European G-SIB C pairs, the pre-registered halt at Jaccard below 0.30 was honoured, and three substrate-level instruments (bridge rate, fragmentation collapse, cumulative cross-cycle coverage) were brought into scope as the correct measurement layer. The taxonomic thread establishes the four-level entity taxonomy and the Meridian Autonomy agent definition that together permit population-scale agent counts to be defended against the missing-agents critique and against the natural confusion with use cases or model artefacts. The cross-platform thread documents the Hostinger-to-Hetzner cutover of 7 May 2026 as the first methodology event in the substrate's history characterised under the four-component variance decomposition; the analysis recasts the cutover as a Component A event (retrieval variance through search-cache state and network path) rather than a Component C event (code variance), because the scanner is version-locked at SHA256 e5250de8e9de07d6 across both platforms.

The four-institution empirical findings are: mean bridge rate of 48.77 percent at European G-SIB C across four cycles (range 46.36 to 50.88), 54.69 percent at European G-SIB D across three cycles (range 50.22 to 60.51), 69.97 percent at the UK G-SIB across three cycles (range 68.84 to 71.89), and 56.53 percent at North American G-SIB B across three cycles (range 53.67 to 58.33); the combined bridge rate across all 13 cycles is 56.50 percent (1,608 of 2,846 staging agents bridged through Lane 1 strict canonicalisation). The fragmentation factor collapsed universally to the 1.331 to 1.744 band across all 13 cycles, with the UK G-SIB at the low end (mean 1.373) and North American G-SIB B at the high end (mean 1.659). Cumulative cross-cycle coverage of the April canonical baseline ranges from 72.68 percent at European G-SIB C (133 of 183 anchors across four cycles) to 80.38 percent at the UK G-SIB (168 of 209 anchors across three cycles), with European G-SIB D at 77.27 percent and North American G-SIB B at 74.90 percent. The within-pilot cgov sample coefficient of variation is bounded between 3.7 percent at North American G-SIB B and 7.9 percent at the UK G-SIB across the four institutions, rather than clustering tightly at a single value.

The first architectural commitment is the hundreds-not-thousands counter-prior. The substrate's institution-level agent count distribution clusters in the 100 to 300 range for large regulated financial institutions, with no institution crossing 1,000 agents across nine months of substrate operation, four scanner versions in production, two production hardware platforms, and three classifier prompt variants. The counter-prior is reinforced at finer grain by the four anchor pool sizes observed in pilot v0.1 (European G-SIB D 176, European G-SIB C 183, the UK G-SIB 209, North American G-SIB B 251) and the per-cycle staging counts (mean 199 at the UK G-SIB, 216 at European G-SIB C, 199 at European G-SIB D, 263 at North American G-SIB B).

The second architectural commitment is the cross-institutional cgov heterogeneity finding. Where an earlier framing had read the pilot as producing a downward systematic offset, the four-institution data refute that reading: European G-SIB C is essentially flat at minus 0.88 percent against the April baseline, European G-SIB D is down 10.54 percent, the UK G-SIB is down 4.03 percent, and North American G-SIB B is up 25.12 percent. The North American G-SIB B upward offset is composition-driven, resolved against the dropped-anchors view: the 188 April anchors re-found in pilot have mean composite score 23.06, the 63 dropped anchors have mean 28.57, and the pilot mean is pulled up by agents in staging that the canonicalisation function did not bridge to a single April anchor under Lane 1 strict gating. The cross-institutional heterogeneity is the actual headline finding.

The forward-looking instrument is the proposal that canonicalisation metrics may operate as an external instrument for disclosure coherence. The UK G-SIB's profile is materially different from the other three institutions on every canonicalisation metric: bridge rate 69.97 percent against 48 to 57 percent at the others; fragmentation factor 1.373 against 1.59 to 1.66; total new canonicals across three cycles 17 against 30 to 122; cumulative coverage 80.38 percent against 72.68 to 77.27 percent. A consistent working hypothesis is that the UK G-SIB's public AI governance disclosure is more standardised than that of the other three banks; the canonicalisation metrics are therefore behaving as a coherence proxy. The hypothesis is consistent with multiple lines of evidence but not settled by an n of four institutions. The paper reports it as an empirical observation worth further study at larger n, articulates the mechanism, and defers operationalisation explicitly. Appendix A inventories the institutional-grade infrastructure underpinning the substrate, including indicative mapping to DORA, the EU AI Act, and ISO 27001.
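
The headline percentages in the abstract can be re-derived from the raw counts it quotes. The short check below is a reader-side sketch, not the study's own tooling; the helper name `pct` and the variable names are ours, and only the counts themselves come from the abstract.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals as in the abstract."""
    return round(100 * part / whole, 2)


# Combined bridge rate: staging agents bridged through Lane 1 strict canonicalisation.
combined_bridge_rate = pct(1608, 2846)

# Cumulative cross-cycle coverage: April baseline anchors re-found across cycles.
coverage_european_c = pct(133, 183)
coverage_uk = pct(168, 209)

# Staging agents that were not bridged to an April anchor under Lane 1.
unbridged = 2846 - 1608

print(combined_bridge_rate, coverage_european_c, coverage_uk, unbridged)
```

The three computed percentages agree with the reported 56.50, 72.68, and 80.38 percent, which suggests the combined bridge rate is pooled over all staging agents rather than averaged over the four institutional means.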
