What question did this study set out to answer?

The study aims to explore challenges in decompiling machine code to identify software vulnerabilities when source code is absent.

January 22, 2026

Problem Issues in Using Large Language Models for Decompilation of Machine Code With Vulnerabilities

Key Points

The study examines the application of large language models to code decompilation.
It identifies the key problematic issues that arise when the machine code contains vulnerabilities.
It provides a practical example of decompiling assembly code with DeepSeek-V3.2.
The identified issues include the incompleteness of datasets for rare processor architectures.
There is no guarantee that the decompiled source code matches the original machine code.
Further challenges are sanitization (the model fixing vulnerabilities in the recovered code), hallucinations, and the restoration of obfuscated code.

Abstract

This paper examines the problem of software vulnerabilities in the absence of source code. One way to counter them is by decompiling the machine (executable) code of programs. The paper considers the application of a relatively new technology, large language models, to the task of restoring pseudo-source code suitable for detecting and eliminating vulnerabilities. The paper identifies problematic issues in the subject area, such as the incompleteness of the dataset for rare processor architectures, the lack of a guarantee that the obtained source code is identical to the specified machine code, the sanitization of the recovered source code by fixing vulnerabilities, hallucinations in the code, and the difficulty of restoring obfuscated (including optimized) code. To substantiate and demonstrate the essence of each problematic issue, a practical example of decompiling assembly code functions using the widespread large language model DeepSeek-V3.2 is provided. The negative impact of these problematic issues on the final neutralization of vulnerabilities is also indicated.
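
The issue of having no guarantee that recovered source code corresponds to the given machine code can be illustrated with a rough round-trip check: recompile the recovered source and compare the resulting assembly with the original listing. The sketch below is not taken from the paper. It assumes both listings are already available as text in Intel-style syntax with `;` comments, and the function names `normalize` and `similarity` and the three sample listings are hypothetical. It measures textual similarity only, so a score below 1.0 flags a difference worth inspecting but does not prove a behavioral difference, since a compiler can emit different instructions for equivalent code.

```python
import difflib


def normalize(listing: str) -> list[str]:
    """Reduce an assembly listing to a list of comparable instruction strings.

    Assumption: Intel-style syntax where ';' starts a comment and a line
    ending in ':' is a label. Both are dropped before comparison.
    """
    instructions = []
    for raw in listing.splitlines():
        line = raw.split(";", 1)[0]             # drop trailing comment
        line = " ".join(line.split()).lower()   # collapse whitespace, ignore case
        if not line or line.endswith(":"):      # skip blank lines and labels
            continue
        instructions.append(line)
    return instructions


def similarity(original: str, recompiled: str) -> float:
    """Return a ratio in [0, 1] of matching instructions between two listings."""
    matcher = difflib.SequenceMatcher(None, normalize(original), normalize(recompiled))
    return matcher.ratio()


# Hypothetical listings: the original function, a faithful round trip,
# and a round trip where the recovered code changed a constant.
original = """
add_one:
    mov eax, edi   ; copy argument
    add eax, 1
    ret
"""

faithful = """
add_one:
    MOV  eax, edi
    add eax, 1
    ret
"""

altered = """
add_one:
    mov eax, edi
    add eax, 2
    ret
"""

if __name__ == "__main__":
    print("faithful:", similarity(original, faithful))
    print("altered: ", similarity(original, altered))
```

A check of this kind only narrows the search. Establishing that two listings behave identically would require testing or formal equivalence checking, which this sketch does not attempt.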

Authors

Konstantin Izrailov

Journals

Scientific and analytical journal «Vestnik Saint-Petersburg university of State fire service of EMERCOM of Russia»
