The VDR arithmetic system represents every value as a triple Value, Denominator, Remainder where the Remainder slot can hold a callable function that produces exact rational values at any requested depth. This functional remainder mechanism — specified in HOWL-VDR-1-2026 and used throughout the system for transcendental evaluation — has implications for hardware that prior papers in the series did not explore. On the VDR-22 integer-native ASIC HOWL-VDR-22-2026, each Q335 Integer Unit already contains a 384-bit ALU with 1-2 cycle multiply and free power-of-two division via fixed wiring. This paper specifies a Functional Remainder Unit (FRU) that extends each QIU to evaluate functional remainders — Taylor recurrences, Newton iterations, and series summations — in hardware using the existing ALU, without round-tripping to the host processor. The FRU adds approximately 500,000 transistors per QIU (3.4% die area increase across 5,120 units) and enables three capabilities that the base VDR-22 chip cannot provide: hardware-native exact exponential softmax at competitive latency with float implementations (25-40 nanoseconds for 1,024 logits versus host-bound milliseconds without the FRU), continuous per-step remainder resolution during training that replaces periodic Q-basis reprojection stalls with microsecond-level maintenance, and complete Prolog unification over active VDR values carrying nonzero remainders. At single-query scale, the FRU does not change wall-clock latency — the language model forward pass at approximately 30 microseconds per command token dominates primitive execution at 1-100 nanoseconds by 300-30,000×. At datacenter scale with millions of concurrent sessions, the FRU eliminates host round-trips for remainder resolution, keeping the entire rule-driven execution path on the data-plane chip and removing the serialization bottleneck that would otherwise limit throughput as accumulated Prolog rules handle an increasing fraction of work autonomously. The paper specifies the FRU microarchitecture, traces the full inference chain with adaptive precision, analyzes the datacenter throughput implications, and identifies the boundary between what the FRU changes (capability and throughput at scale) and what it does not (single-query latency).
Geoffrey Howland (Fri,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: