What does this research mean for the field?

Using large language models (LLMs) to generate compilation options from intermediate representation (IR) abstractions of historical bug-triggering test programs significantly improves compiler bug detection compared with random exploration and existing techniques. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

The aim is to enhance compiler testing by generating effective compilation options using LLMs and historical test data.

May 25, 2026

OptFuzz: Enhancing Compiler Testing via LLM-Powered Compilation Option Generation

Key points

The aim is to enhance compiler testing by generating effective compilation options using LLMs and historical test data.
OptFuzz uses LLMs to extract historical bug-triggering test programs from diverse bug reports.
It abstracts code through an intermediate representation (IR) to stay within LLM context length limits.
Extensive experiments on GCC and LLVM evaluate its bug detection effectiveness.
OptFuzz identified 64 new bugs in GCC and LLVM, with 53 confirmed or fixed.
OptFuzz showed superior bug detection compared with random exploration of the compilation space and other existing techniques.
IR-based analysis substantially reduced overhead and improved bug detection compared with using source code directly.
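
To give a sense of scale for the random exploration baseline, the following Python sketch counts the on/off combinations for a set of flags and draws one random subset. It is only an illustration under stated assumptions: the four GCC flags are a hand-picked sample, the function names are hypothetical, and the sketch ignores option conflicts and non-boolean options, which are part of what makes naive random generation hard in practice. It is not the baseline implementation evaluated in the paper.

```python
import random

# Hand-picked sample of real GCC on/off optimization flags (illustrative only;
# real compilers expose hundreds of options, many with values or conflicts).
FLAGS = [
    "-ftree-vectorize",
    "-funroll-loops",
    "-finline-functions",
    "-fstrict-aliasing",
]


def combination_count(num_flags: int) -> int:
    """Number of distinct on/off assignments for num_flags independent flags."""
    return 2 ** num_flags


def random_option_set(flags, rng):
    """Include each flag independently with probability 0.5 (hypothetical baseline)."""
    return [flag for flag in flags if rng.random() < 0.5]


if __name__ == "__main__":
    print(combination_count(len(FLAGS)))   # combinations for the sample above
    print(combination_count(200) > 10**60)  # 200 boolean flags already exceed 10^60
    print(random_option_set(FLAGS, random.Random(0)))
```

Even with only boolean flags the count doubles with every added option, so exhaustive enumeration stops being practical long before the option list of a production compiler is covered.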

Abstract

Compiler testing is crucial, as compilers serve as foundational infrastructure in software development. Effective compiler testing requires not only the generation of diverse test programs but also the systematic specification of compilation options. While existing research predominantly emphasizes test program diversification, it largely overlooks the strategic selection of compilation options required for comprehensive testing. Compilers typically offer a wide range of fine-grained compilation options that allow precise control over the compilation process, resulting in a vast combination space. Exhaustive enumeration of all option combinations is computationally infeasible, while the stochastic generation of conflict-free, semantically meaningful options presents significant methodological challenges. To address these limitations, we propose OptFuzz, a compiler testing framework that combines the generative capabilities of Large Language Models (LLMs) with the effectiveness of historical bug-triggering test programs to explore the compilation space comprehensively. OptFuzz uses LLMs to extract historical bug-triggering test programs, which have been shown to be effective in uncovering compiler bugs, from diverse bug reports. This overcomes the limitations of existing regex-based extraction techniques. Subsequently, OptFuzz employs a code abstraction extraction method based on intermediate representation (IR) to work within LLM context length limits. Finally, the extracted IR is fed into an LLM to generate effective compilation options for compiler testing. Through extensive experiments on GCC and LLVM, OptFuzz demonstrated superior bug detection capability compared to the random compilation space exploration method and other existing techniques. Notably, OptFuzz discovered 64 new bugs in GCC and LLVM, with 53 confirmed or fixed, highlighting our method's practical utility. The experimental outcomes also indicate that the IR-based analysis substantially decreases overhead and improves bug detection compared to direct use of source code.
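
The IR-then-prompt step described in the abstract can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the helper names, the line-filtering heuristic, and the prompt wording are hypothetical, and the abstract does not describe how OptFuzz actually abstracts IR or phrases its prompts. The only external detail relied on is that `clang -S -emit-llvm` emits textual LLVM IR; the sketch builds that command but does not run it, so no compiler is needed to try it.

```python
from typing import List

# Tiny hand-written LLVM IR sample used in place of real compiler output.
SAMPLE_IR = """; ModuleID = 'a.c'
source_filename = "a.c"

define i32 @main() {
  ret i32 0
}
!0 = !{i32 1}
"""


def emit_ir_command(source_path: str) -> List[str]:
    """Build (but do not run) a clang command that prints textual LLVM IR to stdout."""
    return ["clang", "-S", "-emit-llvm", "-O0", source_path, "-o", "-"]


def abstract_ir(ir_text: str, max_lines: int = 200) -> str:
    """Hypothetical abstraction: drop blank lines, ';' comments and '!' metadata
    definitions, then truncate so the result fits a limited LLM context."""
    kept = []
    for line in ir_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(";") or stripped.startswith("!"):
            continue
        kept.append(stripped)
    return "\n".join(kept[:max_lines])


def build_prompt(ir_summary: str, compiler: str) -> str:
    """Assemble an illustrative prompt asking an LLM for compilation options."""
    return (
        f"You are testing the {compiler} compiler.\n"
        "Given this abbreviated LLVM IR of a historical bug-triggering program, "
        "suggest compilation options likely to exercise related optimizations.\n\n"
        f"{ir_summary}\n"
    )


if __name__ == "__main__":
    print(emit_ir_command("a.c"))
    print(build_prompt(abstract_ir(SAMPLE_IR), "LLVM"))
```

The design point the sketch tries to capture is the one the abstract makes: a compact IR view of the program is what goes to the model, instead of the full source text, so the prompt stays within context limits.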

Cite This Study

Huang et al. studied this question.

synapsesocial.com/papers/6a13e78b0e02ee3982d32371 https://doi.org/10.1145/3817602
