What question did this study set out to answer?

The research aims to compare vendor-sponsored and independent evidence on the productivity of AI coding tools.

April 23, 2026 | Open Access

No Silver Harness: AI Coding Tools, Vendor Narratives, and the Latest Fashion Cycle in Software Engineering

Key Points

Applies a four-dimensional matching framework for systematic comparison.
Compares vendor-affiliated studies with independent empirical evidence.
Draws on a randomized controlled trial involving experienced developers.
Vendor studies report 21-56% productivity gains on isolated tasks.
Independent studies show neutral-to-negative outcomes in mature codebases.
Experienced developers were 19% slower using AI tools while believing themselves 24% faster, a 43-percentage-point perception-reality gap.

Abstract

AI coding tools and "harness engineering" (the practice of designing constraint infrastructure around AI agents) represent the latest iteration of a management fashion cycle (Abrahamson 1996) that has recurred from CASE tools through Agile, DevOps, and microservices. Applying the four-dimensional matching framework developed in a companion paper (Sophia 2026a), this paper presents a systematic comparison of vendor-sponsored and independent empirical evidence on AI coding tool productivity. Vendor-affiliated studies consistently find 21–56% improvements on isolated, greenfield tasks using activity metrics; independent studies measuring system-level outcomes in mature codebases find neutral-to-negative results, with a randomized controlled trial showing experienced developers 19% slower with AI tools despite believing themselves 24% faster, a 43-percentage-point perception-reality gap that calls into question all self-report evidence. The paper documents the enterprise reality gap: the prerequisites for effective AI tool use (reliable automated testing, well-structured architecture, mature CI/CD, and sufficient engineering capability) are met by an estimated 1–3% of organizations. In the remaining 97–99%, AI tools function not as productivity enhancers but as vulnerability amplifiers, accelerating structural decay, propagating security vulnerabilities, degrading review capacity, and eroding developing capability. The current cycle introduces a structural novelty unprecedented in prior fashion cycles: the entities promoting the constraining discipline are the same entities selling the tools whose unreliability necessitates the constraints.
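The 43-percentage-point figure appears to be the simple difference between the perceived change in speed (+24%) and the measured change (-19%). The snippet below is only an illustrative sketch of that arithmetic; the function name `perception_gap` and the sign convention (positive means faster) are assumptions made here and are not taken from the paper.

```python
def perception_gap(perceived_change_pct: float, measured_change_pct: float) -> float:
    """Return the gap, in percentage points, between perceived and measured change.

    Sign convention (an assumption for this sketch): positive values mean
    "faster with AI tools", negative values mean "slower with AI tools".
    """
    return perceived_change_pct - measured_change_pct


# Figures quoted in the abstract: developers believed they were 24% faster
# but were measured as 19% slower.
gap = perception_gap(perceived_change_pct=24.0, measured_change_pct=-19.0)
print(f"Perception-reality gap: {gap:.0f} percentage points")
```

Note that the gap is expressed in percentage points (a difference of two percentages), not as a relative percentage.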
