What question did this study set out to answer?

This research aims to improve neural decompilers by using a training paradigm that focuses on execution signals and compact behavioral explanations.

April 10, 2026 | Open Access

Training Axiomatic LLM Decompilers via Manifold Reconstruction

Key Points

Introduced a training approach incorporating execution-derived signals.
Applied a description-length bias based on Minimum Description Length principles.
Framed reverse engineering as a compact explanation search.
Encouraged the model to avoid overfitting to compiler noise.
Improved behavioral correctness in decompiled outputs.
Generated more syntactically and semantically valid source code.
Achieved consistent performance gains over existing neural decompilers.

Abstract

Current neural decompilers (e.g., HELIOS, LLM4Decompile) treat binary-to-source translation primarily as a text generation problem, which can yield outputs that are syntactically plausible but behaviorally incorrect or non-compilable. This paper introduces a training paradigm for axiomatic decompilation that incorporates execution-derived signals and a description-length bias inspired by Minimum Description Length (MDL) and algorithmic information theory. By framing reverse engineering as a search for a compact explanation of observed behavior, the model is encouraged to ignore “compiler noise” and prefer simpler reconstructions that remain consistent with the observed execution traces.
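The idea of preferring a compact reconstruction that still agrees with observed behavior can be illustrated with a toy scoring function. The sketch below is not the paper's training objective, which the abstract does not specify. It assumes a simple two-part score: a penalty for each execution trace a candidate fails to reproduce, plus a description-length term approximated by the zlib-compressed size of the candidate source. The names `description_length`, `trace_mismatches`, `mdl_score`, the `mismatch_cost` and `weight` parameters, and the hand-written candidates are all hypothetical.

```python
import zlib
from typing import Callable, Sequence, Tuple

# Hypothetical trace format: (input, observed output) pairs from running the binary.
Trace = Tuple[int, int]


def description_length(source: str) -> int:
    """Crude description-length proxy: byte size of the zlib-compressed source."""
    return len(zlib.compress(source.encode("utf-8"), 9))


def trace_mismatches(behavior: Callable[[int], int], traces: Sequence[Trace]) -> int:
    """Count the traces that the candidate's behavior fails to reproduce."""
    return sum(1 for x, y in traces if behavior(x) != y)


def mdl_score(
    source: str,
    behavior: Callable[[int], int],
    traces: Sequence[Trace],
    mismatch_cost: float = 1000.0,  # assumed: one mismatch outweighs any length saving
    weight: float = 1.0,            # assumed strength of the description-length bias
) -> float:
    """Lower is better: behavioral errors dominate, compactness breaks ties."""
    return mismatch_cost * trace_mismatches(behavior, traces) + weight * description_length(source)


# Each candidate pairs decompiled C text with a Python stand-in for its behavior.
# A real pipeline would compile and execute the C instead; these are illustrative only.
candidates = {
    "compact": ("int f(int x) { return x * 2; }", lambda x: x * 2),
    "noisy": (
        "int f(int x) { int tmp0 = x; int tmp1 = tmp0 << 1; int tmp2 = tmp1 ^ 0; "
        "int tmp3 = tmp2 | 0; return tmp3 & -1; }",
        lambda x: (((x << 1) ^ 0) | 0) & -1,
    ),
    "wrong": ("int f(int x) { return x; }", lambda x: x),
}

traces = [(0, 0), (1, 2), (5, 10)]

best = min(candidates, key=lambda name: mdl_score(*candidates[name], traces))
print(best)
```

Under these assumptions, the "noisy" candidate matches every trace but pays for its redundant temporaries, while the "wrong" candidate is the shortest but is rejected because it disagrees with two of the three traces. Compressed size is only a rough stand-in for description length, and very short inputs are dominated by fixed compression overhead.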
