Why does the same instructional support that helps a novice harm an expert? The expertise reversal effect is well established in instructional psychology but lacks a process-level mechanism. This paper proposes a recursive strategy-choice race account and tests it in language models used as a measurement substrate, where the competition between strategies can be read token-by-token as a competing-routes signature.

Abstract. A substrate able to hold multiple strategies for a problem opens a competition between them whenever incoming information does not match its existing commitment; the competition costs computation while it remains unresolved, and a substrate near its competence threshold for the task pays most. On an arithmetic composition task, a one-shot demonstration of a procedure the model already executes leaves a small model unaffected, helps a mid-capacity model, and harms a high-capacity one, with a friction peak at the strategy-crossover. The capacity-graded direction is itself known (larger models are more sensitive to in-context information); the contribution here is that the harm arises from a correct, non-contradictory demonstration, the defining signature of expertise reversal rather than of prior-conflict, and that it appears as a per-token friction peak. A within-family control isolates capacity from model-family and post-training confounds. Further experiments separate demonstration clarity from length and distinguish a format-violation reactance signature from ordinary load. Language models are positioned as a model system for a cognitive mechanism whose human-side confirmation remains future work.

v4 changelog (June 2026). A major revision toward an interdisciplinary cognitive-science venue, with each substantive addition passing external dual hostile review.
Changes:
(1) The mechanistic account is made self-contained, with the winning-route-amplification argument given as a deductive consequence of cross-entropy training on a softmax distribution rather than resting on companion results.
(2) A new second-task generality probe (relational/transitive composition) reports an honest boundary: the expertise-reversal capacity-ordering does NOT generalise to the second task. The high-capacity model is unaffected (no detectable effect, though the probe is underpowered against a small one), while a lower-capacity model is harmed through a different failure mode (low-friction, confident-wrong demonstration-anchoring, not the high-friction open race). Neither the capacity-ordering nor the per-token signature transfers, so claims about expertise reversal must be conditioned on the task.
(3) The within-family capacity control is elevated into the main results.
(4) Scope discipline is applied throughout: cross-substrate and human parallels are framed as model-generated hypotheses and future work, not as measurements of human cognition.
(5) The manuscript is refocused to four core experiments in the main text, with the remaining experiments and the extended cross-substrate analyses moved to a Supplementary Material section (same content as v1-v3, reorganised).

Companion papers in the Friction Theory series (Zenodo-live, concept DOIs):
Paper 0 (Behavioural Friction Theory): 10.5281/zenodo.19462499
Paper 1 (Friction Theory substrate): 10.5281/zenodo.20012654
Paper 2B (In-context learning as working memory, fine-tuning as long-term memory): 10.5281/zenodo.20145218
Paper 4 (Same content, wider track): 10.5281/zenodo.20059859
Paper 6 (Matched friction under hysteresis): 10.5281/zenodo.20059863
Paper 7 (Forward-modelling in bounded race substrates): 10.5281/zenodo.20449154
Paper 13 (Operational Friction Theory): 10.5281/zenodo.20059876
Paper 16 (The Physics of Learning): 10.5281/zenodo.20416959

Series position. Paper 4B in the Friction Theory paper series.
Target venue: Cognitive Science (Wiley; primary) / Computational Brain & Behavior (secondary).
Tomas Pødenphant Lund studied this question.