ABSTRACT
Fuzz testing has proven highly effective at detecting errors and vulnerabilities across a wide range of systems under test (SUTs). Stability is crucial for SUTs that take programming languages as input, such as compilers, runtime engines, constraint solvers, and software libraries with accessible APIs, because these systems serve as the foundation of most software. Their correctness directly determines the reliability of upper-layer software. However, existing general-purpose fuzzers often neglect the guiding role of coverage information during the fuzzing loop and rely on overly simplistic methods for automatic prompt generation. To address these issues, this paper proposes CoverFuzz, a coverage-guided, general-purpose fuzzing framework that supports multiple large models and enables fuzz testing across different programming languages, SUTs, and their respective characteristics. The key idea behind CoverFuzz is a nonuniform coverage-guided fuzzing loop, which maintains the overall efficiency of the fuzzing process while using coverage information to explore untested regions of the target system. To realize CoverFuzz, we introduce two core techniques: an expert template-based prompt generation method and a nonuniform coverage-guided fuzzing loop. The former reduces the user's effort in configuring CoverFuzz and improves the quality of initial prompts, while the latter leverages coverage feedback to enhance the exploration of SUTs and facilitate bug discovery. We evaluated CoverFuzz on four SUTs that accept different input languages: C, C++, Go, and SMT2. The experimental results show that CoverFuzz achieves higher code coverage than state-of-the-art general-purpose fuzzers across all four languages. Moreover, CoverFuzz discovered 47 bugs in systems such as GCC, Clang, CVC5, and Go, 18 of which were previously unknown.
Lin et al. studied this question.