The VDR-LLM-Prolog system replaces floating-point arithmetic with exact integer computation. The immediate objection is performance: integer arithmetic on 100-digit numbers must be slower than hardware-accelerated floating-point arithmetic on 16-bit or 32-bit values. This paper demonstrates that the objection confuses per-operation cost with per-prompt cost. A conventional language model spends thousands of tokens, each requiring a full forward pass through billions of floating-point parameters, on infrastructure work that VDR handles through exact primitive calls costing a few hundred integer operations each. VDR-15 established that the token reduction is 85 to 97 percent. This paper establishes three further points: the integer arithmetic executing those primitives maps efficiently to GPU hardware; the cost of the wider operands is offset by the massive parallelism of modern GPUs; and several architectural properties of VDR (fixed-frame regularity, grammar-constrained decode, indexed knowledge base scans, and frontier-based Prolog execution) produce GPU utilization patterns that are structurally superior to the irregular, attention-dominated workloads of conventional language model inference. The complete GPU mapping is specified in the supplementary technical specification; this paper explains why the mapping works, what its performance characteristics are, and where the actual bottlenecks lie.
Geoffrey Howland (Fri,) studied this question.
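To make the per-operation side of the argument concrete, the following is a minimal sketch of how a 100-digit integer can be split into fixed-width machine words ("limbs") and operated on with ordinary word-sized arithmetic. It is an illustration under stated assumptions, not the VDR implementation: the 32-bit limb width, the schoolbook algorithms, and all names (`to_limbs`, `add_limbs`, `mul_limbs`, `N_LIMBS`) are hypothetical choices made here. The only fact it relies on is that a 100-digit decimal number needs at most 333 bits, which is 11 limbs of 32 bits.

```python
# Illustrative sketch only: limb width, names, and schoolbook algorithms are
# assumptions for this example, not taken from the VDR specification.
LIMB_BITS = 32
LIMB_MASK = (1 << LIMB_BITS) - 1

# Largest 100-digit value needs 333 bits, so 11 limbs of 32 bits (ceiling division).
N_LIMBS = -(-(10**100 - 1).bit_length() // LIMB_BITS)


def to_limbs(value: int, n_limbs: int = N_LIMBS) -> list[int]:
    """Split a non-negative integer into little-endian 32-bit limbs."""
    if value < 0 or value >> (LIMB_BITS * n_limbs):
        raise ValueError("value does not fit in the requested number of limbs")
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(n_limbs)]


def from_limbs(limbs: list[int]) -> int:
    """Reassemble an integer from little-endian limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_limbs(a: list[int], b: list[int]) -> tuple[list[int], int]:
    """Add two equal-length limb arrays; return (n+1 limbs, word-level op count)."""
    out, carry, ops = [], 0, 0
    for x, y in zip(a, b):
        total = x + y + carry          # one add-with-carry per limb
        out.append(total & LIMB_MASK)
        carry = total >> LIMB_BITS
        ops += 1
    out.append(carry)
    return out, ops


def mul_limbs(a: list[int], b: list[int]) -> tuple[list[int], int]:
    """Schoolbook multiply; return (len(a)+len(b) limbs, word-level op count)."""
    out = [0] * (len(a) + len(b))
    ops = 0
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            total = out[i + j] + x * y + carry   # one multiply-accumulate
            out[i + j] = total & LIMB_MASK
            carry = total >> LIMB_BITS
            ops += 1
        out[i + len(b)] = carry
    return out, ops


a, b = 10**100 - 1, 10**99 + 7
sum_limbs, add_ops = add_limbs(to_limbs(a), to_limbs(b))
prod_limbs, mul_ops = mul_limbs(to_limbs(a), to_limbs(b))
```

Under these assumptions an addition touches each of the 11 limbs once, and a schoolbook multiplication performs 11 × 11 word-level multiply-accumulate steps. Each step is an independent fixed-width integer operation of the kind GPU integer units execute natively; how VDR actually schedules such steps across threads is left to the supplementary specification.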