Saad-Falcon et al. (2025) introduced Intelligence Per Watt (IPW) as the critical metric for tracking AI efficiency: task accuracy divided by power consumed. Their longitudinal study documents a 5.3x IPW improvement from 2023 to 2025, driven by model and hardware advances. This paper demonstrates that a stack of algorithmic efficiency optimizations, derived from a unified stochastic health monitoring framework, provides an additional multiplicative IPW improvement on top of whatever hardware is available. The core three-layer algorithmic stack (FlashAttention, run-level power metric allocation, and early exit) provides a combined 54x IPW improvement at a sequence length of 4,096 tokens, through three orthogonal mechanisms: per-operation memory bandwidth efficiency (FlashAttention, 2.86x), allocation efficiency reducing which operations occur (power metric inference, 5.18x), and depth efficiency reducing how many layers each operation uses (early exit, 1.56x). These layers are treated as independent and compound multiplicatively. Applied on top of Saad-Falcon et al.'s 2025 hardware baseline, the combined IPW improvement is estimated at up to approximately 122x versus the 2023 baseline. The full stack, including speculative decoding and quality-preserving layers, reaches a 70x improvement from algorithms alone. Critically, these algorithmic gains are available today on existing hardware; they do not require waiting for the next hardware generation.
Keywords: intelligence per watt, IPW, energy efficiency, algorithmic efficiency, FlashAttention, power metric, early exit, speculative decoding, compute stack, Saad-Falcon
These estimates will strike some readers as impossible. They are not. They represent a structured upper bound under partial independence assumptions, which is to say the math works out this way even if we wish it were less dramatic. The headline number is large. We checked. It's still large.
The purpose of this paper is not to assert realized gains, but to map a hypothesis space and identify where empirical validation is most valuable.
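The compounding arithmetic behind these estimates can be sketched in a few lines. This is a minimal illustration, not the paper's method: it assumes the three quoted per-layer factors are fully independent and multiply directly, and the names `LAYER_GAINS`, `HARDWARE_GAIN_2023_TO_2025`, and `compound` are hypothetical labels chosen for this example. Under that assumption the three factors multiply to about 23.1x, and combining that with the 5.3x hardware gain lands near the 122x figure. The 54x figure is not reproduced by these three factors alone and presumably rests on assumptions not spelled out in this abstract.

```python
import math

# Per-layer IPW multipliers as quoted in the abstract (sequence length 4,096).
# The dictionary keys are illustrative labels, not identifiers from the paper.
LAYER_GAINS = {
    "flash_attention": 2.86,          # per-operation memory bandwidth efficiency
    "power_metric_allocation": 5.18,  # allocation efficiency
    "early_exit": 1.56,               # depth efficiency
}

# Hardware and model IPW gain from 2023 to 2025, as quoted from Saad-Falcon et al.
HARDWARE_GAIN_2023_TO_2025 = 5.3


def compound(gains):
    """Multiply efficiency factors, assuming they are fully independent."""
    return math.prod(gains)


algorithmic = compound(LAYER_GAINS.values())
versus_2023 = algorithmic * HARDWARE_GAIN_2023_TO_2025

print(f"three-layer product: {algorithmic:.1f}x")   # three-layer product: 23.1x
print(f"with hardware gain:  {versus_2023:.1f}x")   # with hardware gain:  122.5x
```

Because the product is only as good as the independence assumption, any overlap between layers (for example, early exit skipping work that allocation had already pruned) would pull the combined factor below this value, which is why the figures are best read as an upper bound.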
Cole Cantrell (Tue,) studied this question.