Causal attention in trained transformers exhibits power-law decay in attention weight as a function of query–key separation, with a position-dependent enhancement near the start of the sequence. We previously interpreted this as the two-point function of a boundary conformal field theory (BCFT) on a strip whose left edge is the start of the sequence. The framework predicts that per-head conformal weight Δ, measured from short-context random-token attention, should positively rank-correlate across heads with a long-range "valley depth" measure related to the "lost-in-the-middle" phenomenon. We pre-registered the prediction Spearman ρ(Δ, valley) ≥ 0.50, p ≤ 10⁻⁵, and tested it on seven decoder-only transformers (Pythia-410m/1.4B/2.8B, GPT-Neo-2.7B, Qwen2-7B, OLMo-7B, Mistral-7B-v0.3). Six confirmed; Pythia-2.8B falsified at ρ = +0.46. A per-layer diagnostic localizes the falsification to layers 22–27 and shows that GPT-Neo-2.7B (the matched control: same parameter count, same training data, different training recipe) confirms with ρ = +0.96 across all 32 layers. Fitting the full BCFT functional form (3 parameters per head: C, Δ, λ) on Pythia-2.8B and GPT-Neo-2.7B reveals that 88–94% of conformal heads prefer BCFT over the bare power law, that λ is mostly positive and well structured, and that ΔBCFT is closer to the SYK Δ = 1/4 prediction than ΔPL. However, the pre-registered scalar ρ(Δ, valley) becomes weaker with the cleaner ΔBCFT, while the joint (Δ, λ) → valley rank-regression explains substantially more variance. Two findings demand explanation: (i) ρ(λ, valley) is mostly negative across layers in both models, the opposite of the framework's prediction; (ii) GPT-Neo-2.7B exhibits an alternating-layer pattern with two distinct populations of heads by boundary structure. We discuss what this changes about the framework, what we would do differently in a follow-up pre-registration, and where the most informative remaining tests lie.
All code, raw per-head data, and the pre-registration document are in the public repository at Capacity-For-Evil/ariel.
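For readers who want to see the shape of the pre-registered statistic, the following is a minimal sketch of a Spearman rank correlation between per-head Δ and valley depth, compared against the ρ ≥ 0.50 threshold. It is not taken from the repository: the function names (`average_ranks`, `spearman_rho`) and the six data points are hypothetical and chosen only for illustration, and the p ≤ 10⁻⁵ half of the criterion is omitted because it needs a t-distribution routine outside the Python standard library.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


RHO_THRESHOLD = 0.50  # pre-registered lower bound on rho, from the abstract

# Hypothetical per-head values, for illustration only (not data from the paper).
delta = [0.21, 0.25, 0.30, 0.34, 0.41, 0.47]
valley = [0.10, 0.18, 0.15, 0.29, 0.33, 0.40]

rho = spearman_rho(delta, valley)
print(f"rho = {rho:+.3f}; meets rho threshold: {rho >= RHO_THRESHOLD}")
```

Because the statistic depends only on ranks, any monotone rescaling of Δ or of valley depth leaves ρ unchanged, which is why a rank correlation suits a comparison across heads whose raw scales differ.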
synapsesocial.com/papers/69e47440010ef96374d8ff91 — DOI: https://doi.org/10.5281/zenodo.19629862
Ariel Umphrey
Mission Heritage Medical Group
Eldon Umphrey
Mission Heritage Medical Group