What question did this study set out to answer?

This research aims to improve the efficiency of fuzz testing by developing a novel termination tool, F-800, that uses function clustering.

June 8, 2026

F-800: Agentic Fuzzing Termination via Function Clustering: Toward Smarter Early Stopping

Key Points

Introduced F-800, a fuzzing termination tool based on function clustering.
Established relationships between function clusters and vulnerability distribution.
Monitored code coverage in clusters to determine adequacy of testing before termination.
F-800 reduced fuzzing time by 1.4–7.2 hours (5–30%) across configurations.
Kept bug loss minimal, missing an average of 0.25 bugs.
Outperformed existing potential-vulnerability and function-coverage-based termination approaches.

Abstract

Fuzzing is a testing technique that generates large numbers of inputs in an attempt to crash the program under test. As software development accelerates and projects scale, the demand for fuzz testing in software assurance has increased, and comprehensive fuzz testing of all functions has become increasingly challenging and resource-intensive. Current methods for deciding when to stop a fuzzing campaign rely on metrics such as function coverage, potential-vulnerability function coverage, or crash count. However, these metrics fail to account for the scale of the functions under test. For example, function coverage may lead to excessive testing of non-critical functions, while vulnerability function coverage can result in premature termination if the estimated number of vulnerable functions is too low. This paper introduces a novel fuzzing termination tool, F-800, an agent based on function clustering. F-800 first establishes a relationship between function clusters and vulnerability distribution, then refines the clusters using functional summaries. It subsequently monitors the code coverage of each cluster to determine whether that cluster has been sufficiently tested. The fuzzing campaign terminates once neither the coverage of function clusters nor the number of crashes within specific clusters continues to increase. Our experiments on eight function libraries demonstrate that F-800 significantly improves testing efficiency, reducing fuzzing time by 1.4–7.2 hours (5–30%) across different configurations while maintaining minimal bug loss (averaging 0.25 bugs), outperforming existing approaches such as potential-vulnerability-based and function-coverage-based methods.
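The termination rule in the abstract (stop once neither per-cluster coverage nor per-cluster crash counts keep increasing) can be illustrated with a minimal sketch. This is not F-800's implementation: the abstract does not say how a plateau is detected, so the `patience` window, the snapshot format, and the names `ClusterStats` and `should_stop` are all assumptions introduced here. The sketch also checks crashes in every cluster, whereas the abstract refers to "specific clusters".

```python
from dataclasses import dataclass


@dataclass
class ClusterStats:
    """Hypothetical per-cluster counters sampled at a fixed interval."""
    covered_lines: int = 0
    crashes: int = 0


def should_stop(history, patience):
    """Return True if no cluster grew in coverage or crashes
    over the last `patience` snapshots.

    history:  list of snapshots, oldest first; each snapshot maps a
              cluster id to its ClusterStats at that moment.
    patience: assumed number of consecutive snapshots without growth
              that counts as a plateau (not specified by the source).
    """
    if len(history) <= patience:
        return False  # not enough observations to judge a plateau

    baseline = history[-(patience + 1)]
    latest = history[-1]
    for cluster_id, now in latest.items():
        before = baseline.get(cluster_id, ClusterStats())
        if now.covered_lines > before.covered_lines:
            return False  # this cluster is still gaining coverage
        if now.crashes > before.crashes:
            return False  # this cluster is still producing new crashes
    return True


# Usage: two hypothetical clusters; "parse" grows once, then everything stalls.
s0 = {"parse": ClusterStats(10, 0), "io": ClusterStats(5, 0)}
s1 = {"parse": ClusterStats(20, 1), "io": ClusterStats(5, 0)}
history = [s0, s1, dict(s1), dict(s1)]

stop_early = should_stop(history[:3], patience=2)  # growth still inside window
stop_late = should_stop(history, patience=2)       # two stalled snapshots
```

Comparing against a snapshot `patience` steps back, rather than only the previous one, keeps a single quiet interval from ending the campaign; the right window length would depend on the fuzzer and target.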
