What question did this study set out to answer?

The aim is to develop a method for extracting readable and executable sparse circuits from trained neural network weights.

April 1, 2026 | Open Access

Neural Decompilation: Extracting Verified Sparse Circuits from Transformer Weights

Key Points

Introduces neural decompilation, a technique for extracting readable, executable sparse circuits from trained weights.
Evaluates the decompiled circuits on 13 RNN classification tasks.
Uses the Kani model checker to formally verify circuits.
Analyzes the entropy of layer-0 K projections in production LLMs.
Examines a TinyLlama attention head that implements a multi-script classifier.
Decompiled circuits achieve perfect accuracy on all 13 RNN classification tasks.
Six of the circuits are formally verified across all possible inputs.
Layer-0 K projections are entropy outliers, 4.3 to 7.3 sigma below the cross-layer mean.
The TinyLlama head's sparse circuit (2.7% of weights) is causally necessary and functionally faithful.
The finding replicates in Qwen2.5-0.5B with a similar Fisher score (5.44, versus 5.94 in TinyLlama).
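
To make the entropy-outlier point concrete, here is a minimal sketch of how a per-layer entropy z-score could be computed. The source does not state how entropy is measured, so everything here is an assumption: the histogram-based Shannon entropy, the bin count and range, the choice to include every layer in the reference statistics, and the function names `histogram_entropy` and `entropy_z_scores`. The weights are synthetic stand-ins, not values from any real model.

```python
import math
import random
import statistics


def histogram_entropy(weights, bins=32, lo=-1.0, hi=1.0):
    """Shannon entropy (bits) of a fixed-range histogram of weight values.

    Assumption: the study's entropy measure is not specified; this is one
    plausible stand-in.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for w in weights:
        # Clamp out-of-range values into the first or last bin.
        idx = min(bins - 1, max(0, int((w - lo) / width)))
        counts[idx] += 1
    total = len(weights)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def entropy_z_scores(per_layer_weights):
    """Z-score of each layer's entropy against the mean over all layers."""
    entropies = [histogram_entropy(w) for w in per_layer_weights]
    mean = statistics.mean(entropies)
    std = statistics.pstdev(entropies)  # population std over the layers
    return [(h - mean) / std for h in entropies]


# Synthetic data: one "discrete" layer followed by Gaussian-like layers.
rng = random.Random(0)
layer0 = [rng.choice([-0.5, 0.0, 0.5]) for _ in range(4096)]
deeper = [[rng.gauss(0.0, 0.3) for _ in range(4096)] for _ in range(21)]
z = entropy_z_scores([layer0] + deeper)
```

In this toy setup the first entry of `z` is strongly negative because a matrix with only a few distinct values has far lower histogram entropy than a smoothly distributed one. Note that when the outlier layer is included in the reference statistics, the magnitude of its z-score is bounded by the number of layers, so a real analysis might instead compare against the remaining layers only.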

Abstract

We introduce neural decompilation, a method that extracts readable, executable sparse circuits from trained neural network weights. On 13 RNN classification tasks, decompiled circuits achieve perfect accuracy, and 6 are formally verified by the Kani model checker across all possible inputs. Applied to production LLMs, we discover that layer-0 K projections are entropy outliers (4.3 to 7.3 sigma below the cross-layer mean) containing discrete attention routing circuits absent in deeper layers. One TinyLlama attention head implements a multi-script classifier whose sparse circuit (2.7% of weights) is causally necessary (ablation Fisher 5.94 to 0.00), functionally faithful (interchange intervention KL = 0.00045, top-1 agreement 97.6%), and cross-architectural (replicated in Qwen2.5-0.5B with Fisher = 5.44). Code: https://github.com/thebasedcapital/neural-decompile
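
The faithfulness figures quoted in the abstract (a KL divergence and a top-1 agreement rate) can be illustrated with a short sketch. This is not the authors' evaluation code: the function names, the KL direction (full model relative to the circuit-patched model), the averaging over positions, and the toy logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def faithfulness(full_logits, circuit_logits):
    """Mean KL(full || circuit) and top-1 agreement over a batch of positions."""
    kls, agree = [], 0
    for a, b in zip(full_logits, circuit_logits):
        kls.append(kl_divergence(softmax(a), softmax(b)))
        # Count positions where both models pick the same most likely token.
        agree += a.index(max(a)) == b.index(max(b))
    n = len(kls)
    return sum(kls) / n, agree / n


# Toy logits (hypothetical): rows are positions, columns are vocabulary items.
full = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [0.0, 0.2, 3.0]]
patched = [[1.9, 0.6, -1.0], [0.1, 1.4, 0.4], [0.0, 0.2, 3.0]]
mean_kl, top1 = faithfulness(full, patched)
```

A small mean KL together with high top-1 agreement indicates that swapping in the sparse circuit barely changes the model's output distribution, which is the sense of "functionally faithful" used above.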
