This paper presents evidence that the primary barrier to reliable fact recall in sub-frontier language models deployed with system prompts is positional attention bias, the lost-in-the-middle phenomenon. In a systematic evaluation across five models and two model families, we found that restructuring the system prompt to place critical facts at the beginning and end raises a 14-billion-parameter model's score from 5.7/10 to 7.5/10 on a seven-dimension verification battery, without any modification to model weights. The paper further describes Ground Truth Engineering, a systematic behavioural rule methodology that raised a 32B model from 6.2/10 to 9.4/10 with cold-restart persistence, and an automatic correction persistence pipeline. These findings enable sovereign AI deployment on consumer hardware costing under 500.
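As a rough illustration of the prompt restructuring described above, the following minimal sketch assembles a system prompt with the critical facts at the beginning and again at the end, leaving lower-priority text in the middle. The function name `build_system_prompt`, the section labels, and the choice to repeat the facts verbatim at the end are all assumptions made for illustration; the abstract does not specify the exact template used in the evaluation.

```python
from typing import Sequence


def build_system_prompt(
    critical_facts: Sequence[str],
    background: Sequence[str],
) -> str:
    """Put critical facts at the start and end, background in the middle.

    Hypothetical helper: the section labels and the verbatim repetition
    of the facts are illustrative assumptions, not the paper's template.
    """
    facts_block = "\n".join(f"- {fact}" for fact in critical_facts)
    middle_block = "\n".join(background)
    sections = [
        "KEY FACTS:\n" + facts_block,              # start of the prompt
        middle_block,                              # lower-priority material
        "REMINDER OF KEY FACTS:\n" + facts_block,  # end of the prompt
    ]
    # Drop empty sections so an empty background leaves no blank gap.
    return "\n\n".join(section for section in sections if section.strip())


if __name__ == "__main__":
    print(
        build_system_prompt(
            critical_facts=["The office closes at 17:00."],
            background=["You are a helpful assistant.", "Be concise."],
        )
    )
```

Under these assumptions, the same fact list brackets the prompt, so the material most likely to be under-attended is the background text rather than the facts the deployment depends on.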
Farah Jaber (Sun,) studied this question.