What question did this study set out to answer?

The study asks what experimental setup is needed to produce trustworthy evaluations of fuzz testing algorithms.

October 15, 2018 (Open Access)

Evaluating Fuzz Testing

George Klees, Andrew Ruef, Benji Cooper (University of Maryland, College Park)

Key Points

This research aims to identify and improve the experimental setups used to evaluate fuzz testing algorithms.
The authors surveyed 32 recent fuzzing papers and assessed their experimental evaluations.
They found problems in every evaluation they considered.
They then ran their own extensive experimental evaluation using an existing fuzzer.
The results showed that these general problems can translate into wrong or misleading assessments.
The paper concludes with guidelines intended to make experimental evaluations of fuzz testing more robust.

Abstract

Fuzz testing has enjoyed great success at discovering security-critical bugs in real software. Recently, researchers have devoted significant effort to devising new fuzzing techniques, strategies, and algorithms. Such new ideas are primarily evaluated experimentally, so an important question is: What experimental setup is needed to produce trustworthy results? We surveyed the recent research literature and assessed the experimental evaluations carried out by 32 fuzzing papers. We found problems in every evaluation we considered. We then performed our own extensive experimental evaluation using an existing fuzzer. Our results showed that the general problems we found in existing experimental evaluations can indeed translate to actual wrong or misleading assessments. We conclude with some guidelines that we hope will help improve experimental evaluations of fuzz testing algorithms, making reported results more robust.
