Frontier AI coding models now resolve the large majority of curated bug-fix benchmarks, yet independent evaluations that approach real enterprise conditions report dramatically lower success and, in at least one controlled trial, net negative productivity for experienced developers. We argue this gap is not a capability gap. The evidence, across feature-implementation benchmarks, long-context degradation studies, the documented failure of retrieval for code, and the independent convergence of every major vendor on structured instruction files, points to a single binding constraint: context. Models fail not because they cannot write code but because the right information does not reach them at the right time, in the right structure, within a finite attention budget. We name the missing layer context orchestration: the systematic engineering of what reaches a model, when, and in what form, governed and versioned alongside the codebase itself. This paper states the problem rigorously, diagnoses its mechanism, situates the economic stakes in Latin America, and presents the conceptual shape of a governed context layer (its founding principles, the harness-engineering rationale for why it works, and its position in the stack) without disclosing the proprietary implementation. We close with an honest evaluation posture and the limitations that remain open.
Santiago Ramirez (Tue,) studied this question.