skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: One Fuzzing Strategy to Rule Them All
Coverage-guided fuzzing has become mainstream in fuzzing to automatically expose program vulnerabilities. Recently, a group of fuzzers are proposed to adopt a random search mechanism namely Havoc, explicitly or implicitly, to augment their edge exploration. However, they only tend to adopt the default setup of Havoc as an implementation option while none of them attempts to explore its power under diverse setups or inspect its rationale for potential improvement. In this paper, to address such issues, we conduct the first empirical study on Havoc to enhance the understanding of its characteristics. Specifically, we first find that applying the default setup of Havoc to fuzzers can significantly improve their edge coverage performance. Interestingly, we further observe that even simply executing Havoc itself without appending it to any fuzzer can lead to strong edge coverage performance and outperform most of our studied fuzzers. Moreover, we also extend the execution time of Havoc and find that most fuzzers can not only achieve significantly higher edge coverage, but also tend to perform similarly (i.e., their performance gaps get largely bridged). Inspired by the findings, we further propose Havoc𝑀𝐴𝐵, which models the Havoc mutation strategy as a multi-armed bandit problem to be solved by dynamically adjusting the mutation strategy. The evaluation result presents that Havoc𝑀𝐴𝐵 can significantly increase the edge coverage by 11.1% on average for all the benchmark projects compared with Havoc and even slightly outperform state-of-the-art QSYM which augments its computing resource by adopting three parallel threads. We further execute Havoc𝑀𝐴𝐵 with three parallel threads and result in 9% higher average edge coverage over QSYM upon all the benchmark projects  more » « less
Award ID(s):
2131943 2141474
PAR ID:
10318076
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the International Conference on Software Engineering
ISSN:
1819-3781
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Coverage-guided fuzzing's aggressive, high-volume testing has helped reveal tens of thousands of software security flaws. While executing billions of test cases mandates fast code coverage tracing, the nature of binary-only targets leads to reduced tracing performance. A recent advancement in binary fuzzing performance is Coverage-guided Tracing (CGT), which brings orders-of-magnitude gains in throughput by restricting the expense of coverage tracing to only when new coverage is guaranteed. Unfortunately, CGT suits only a basic block coverage granularity---yet most fuzzers require finer-grain coverage metrics: edge coverage and hit counts. It is this limitation which prohibits nearly all of today's state-of-the-art fuzzers from attaining the performance benefits of CGT.This paper tackles the challenges of adapting CGT to fuzzing's most ubiquitous coverage metrics. We introduce and implement a suite of enhancements that expand CGT's introspection to fuzzing's most common code coverage metrics, while maintaining its orders-of-magnitude speedup over conventional always-on coverage tracing. We evaluate their trade-offs with respect to fuzzing performance and effectiveness across 12 diverse real-world binaries (8 open- and 4 closed-source). On average, our coverage-preserving CGT attains near-identical speed to the present block-coverage-only CGT, UnTracer; and outperforms leading binary- and source-level coverage tracers QEMU, Dyninst, RetroWrite, and AFL-Clang by 2--24x, finding more bugs in less time. 
    more » « less
  2. While many real-world programs are shipped with configurations to enable/disable functionalities, fuzzers have mostly been applied to test single configurations of these programs. In this work, we first conduct an empirical study to understand how program configurations affect fuzzing performance. We find that limiting a campaign to a single configuration can result in failing to cover a significant amount of code. We also observe that different program configurations contribute differing amounts of code coverage, challenging the idea that each one can be efficiently fuzzed individually. Motivated by these two observations, we propose ConfigFuzz , which can fuzz configurations along with normal inputs. ConfigFuzz transforms the target program to encode its program options within part of the fuzzable input, so existing fuzzers’ mutation operators can be reused to fuzz program configurations. We instantiate ConfigFuzz on six configurable, common fuzzing targets, and integrate their executions in FuzzBench. In our evaluation, ConfigFuzz outperforms two baseline fuzzers in four targets, while the results are mixed in the other targets due to program size and configuration space. We also analyze the options fuzzed by ConfigFuzz and how they affect the performance. 
    more » « less
  3. null (Ed.)
    Fuzz testing, or fuzzing, has become one of the de facto standard techniques for bug finding in the software industry. In general, fuzzing provides various inputs to the target program with the goal of discovering unhandled exceptions and crashes. In business sectors where the time budget is limited, software vendors often launch many fuzzing instances in parallel as a common means of increasing code coverage. However, most of the popular fuzzing tools — in their parallel mode — naively run multiple instances concurrently, without elaborate distribution of workload. This can lead different instances to explore overlapped code regions, eventually reducing the benefits of concurrency. In this paper, we propose a general model to describe parallel fuzzing. This model distributes mutually-exclusive but similarly-weighted tasks to different instances, facilitating concurrency and also fairness across instances. Following this model, we develop a solution, called AFL-EDGE, to improve the parallel mode of AFL, considering a round of mutations to a unique seed as a task and adopting edge coverage to define the uniqueness of a seed. We have implemented AFL-EDGE on top of AFL and evaluated the implementation with AFL on 9 widely used benchmark programs. It shows that AFL-EDGE can benefit the edge coverage of AFL. In a 24-hour test, the increase of edge coverage brought by AFL-EDGE to AFL ranges from 9.5% to 10.2%, depending on the number of instances. As a side benefit, we discovered 14 previously unknown bugs. 
    more » « less
  4. Kernel fuzzers rely heavily on program mutation to automatically generate new test programs based on existing ones. In particular, program mutation can alter the test's control and data flow inside the kernel by inserting new system calls, changing the values of call arguments, or performing other program mutations. However, due to the complexity of the kernel code and its user-space interface, finding the effective mutation that can lead to the desired outcome such as increasing the coverage and reaching a target code location is extremely difficult, even with the widespread use of manually-crafted heuristics. This work proposes Snowplow, a kernel fuzzer that uses a learned white-box test mutator to enhance test mutation. The core of Snowplow is an efficient machine learning model that can learn to predict promising mutations given the test program to mutate, its kernel code coverage, and the desired coverage. Snowplow is demonstrated on argument mutations of the kernel tests, and evaluated on recent Linux kernel releases. When fuzzing the kernels for 24 hours, Snowplow shows a significant speedup of discovering new coverage (4.8x~5.2x) and achieves higher overall coverage (7.0%~8.6%). In a 7-day fuzzing campaign, Snowplow discovers 86 previously-unknown crashes. Furthermore, the learned mutator is shown to accelerate directed kernel fuzzing by reaching 19 target code locations 8.5x faster and two additional locations that are missed by the state-of-the-art directed kernel fuzzer. 
    more » « less
  5. Greybox fuzzing has made impressive progress in recent years, evolving from heuristics-based random mutation to solving individual branch constraints. However, they have difficulty solving path constraints that involve deeply nested conditional statements, which are common in image and video decoders, network packet analyzers, and checksum tools. We propose an approach for addressing this problem. First, we identify all the control flow-dependent conditional statements of the target conditional statement. Next, we select the taint flow-dependent conditional statements. Finally, we use three strategies to find an input that satisfies all conditional statements simultaneously. We implemented this approach in a tool called Matryoshka and compared its effectiveness on 13 open source programs with other state-of-the-art fuzzers. Matryoshka achieved significantly higher cumulative line and branch coverage than AFL, QSYM, and Angora. We manually classified the crashes found by Matryoshka into 41 unique new bugs and obtained 12 CVEs. Our evaluation demonstrates the key technique contributing to Matryoshka's impressive performance: among the nesting constraints of a target conditional statement, Matryoshka collects only those that may cause the target unreachable, which greatly simplifies the path constraint that it has to solve. 
    more » « less