skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Matryoshka: Fuzzing Deeply Nested Branches
Greybox fuzzing has made impressive progress in recent years, evolving from heuristics-based random mutation to solving individual branch constraints. However, they have difficulty solving path constraints that involve deeply nested conditional statements, which are common in image and video decoders, network packet analyzers, and checksum tools. We propose an approach for addressing this problem. First, we identify all the control flow-dependent conditional statements of the target conditional statement. Next, we select the taint flow-dependent conditional statements. Finally, we use three strategies to find an input that satisfies all conditional statements simultaneously. We implemented this approach in a tool called Matryoshka and compared its effectiveness on 13 open source programs with other state-of-the-art fuzzers. Matryoshka achieved significantly higher cumulative line and branch coverage than AFL, QSYM, and Angora. We manually classified the crashes found by Matryoshka into 41 unique new bugs and obtained 12 CVEs. Our evaluation demonstrates the key technique contributing to Matryoshka's impressive performance: among the nesting constraints of a target conditional statement, Matryoshka collects only those that may cause the target unreachable, which greatly simplifies the path constraint that it has to solve.  more » « less
Award ID(s):
1801751
PAR ID:
10156937
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the ACM Conference on Computer and Communications Security
ISSN:
1543-7221
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Parsons puzzles are jigsaw puzzles wherein students are given a program in scrambled order and tasked with reassembling the program in its correct order. Do students use program semantics when solving Parsons puzzles? The answer to this question has implications for the use of Parsons puzzles as a pedagogic tool. In order to answer this question, we considered semantics at the level of statements and control-flow. We analyzed the data collected by a Parsons puzzle tutor over 5 semesters and measured the extent to which students’ puzzle-solving behavior conformed with the use of statement-level and control-flow semantics. We found that students used statement-level semantics to assemble up to 73% of the lines in a puzzle and control-flow semantics to assemble up to 47% of the lines. They used statement-level semantics more than control-flow semantics and more on some puzzles than others. Whenever we found a significant difference between C++ and Java students, C++ students used semantics more than Java students. Finally, we did not find an increase in the use of semantics with in-creased practice. 
    more » « less
  2. Karunakaran, S. S.; Reed, Z.; Higgins, A. (Ed.)
    Future mathematics teachers must be able to interpret a wide range of mathematical statements, in particular conditional statements. Literature shows that even when students are familiar with conditional statements and equivalence to the contrapositive, identifying other equivalent and non-equivalent forms can be challenging. As a part of a larger grant to enhance and study prospective secondary teachers’ (PSTs’) mathematical knowledge for teaching proof, we analyzed data from 26 PSTs working on a task that required rewriting a conditional statement in several different forms and then determining those that were equivalent to the original statement. We identified three key strategies used to make sense of the various forms of conditional statements and to identify equivalent and non-equivalent forms: meaning making, comparing truth-values and comparing to known syntactic forms. The PSTs relied both on semantic meaning of the statements and on their formal logical knowledge to make their judgments. 
    more » « less
  3. We recently proposed inline tests for validating individual program statements; they allow developers to provide test inputs, expected outputs, and test oracles immediately after a target statement. But, existing code can have many target statements. So, automatic generation of inline tests is an important next step towards increasing their adoption. We propose ExLi, the first technique for automatically generating inline tests. ExLi extracts inline tests from unit tests; it first records all variable values at a target statement while executing unit tests. Then, ExLi uses those values as test inputs and test oracles in an initial set of generated inline tests. Target statements that are executed many times could have redundant initial inline tests. So, ExLi uses a novel coverage-then-mutants based reduction process to remove redundant inline tests. We implement ExLi for Java and use it to generate inline tests for 718 target statements in 31 open-source programs. ExLi reduces 17,273 initially generated inline tests to 905 inline tests. The final set of generated inline tests kills up to 25.1% more mutants on target statements than developer written and automatically generated unit tests. That is, ExLi generates inline tests that can improve the fault-detection capability of the test suites from which they are extracted. 
    more » « less
  4. We recently proposed inline tests for validating individual program statements; they allow developers to provide test inputs, expected outputs, and test oracles immediately after a target statement. But, existing code can have many target statements. So, automatic generation of inline tests is an important next step towards increasing their adoption. We propose ExLi, the first technique for automatically generating inline tests. ExLi extracts inline tests from unit tests; it first records all variable values at a target statement while executing unit tests. Then, ExLi uses those values as test inputs and test oracles in an initial set of generated inline tests. Target statements that are executed many times could have redundant initial inline tests. So, ExLi uses a novel coverage-and-mutation based reduction process to remove redundant inline tests. We implement ExLi for Java and use it to generate inline tests for 718 target statements in 31 open-source programs. ExLi reduces 17,273 initially generated inline tests to 905 inline tests. The final set of generated inline tests kills up to 25.1% more mutants than developer written and automatically generated unit tests. That is, ExLi generates inline tests that can improve the fault-detection capability of the test suites from which they are extracted. 
    more » « less
  5. We describe a fast, abstract method for reverse engineering (RE) field programmable gate array (FPGA) look-up-tables (LUTs). Our method has direct applications to hardware (HW) metering and FPGA fingerprinting, and our approach allows easy portability and application to most L UT based FPGAs. Unlike conventional RE methodologies that rely on vendor specific code (like Xilinx XDL), tools, configuration files, components, etc., our methodology is not dependent on any specific FPGA or FPGA computer aided design (CAD) tool. We use generic hardware description language (HDL) code based on specially connected CASE statements to program the L UTs on a target FPGA. Our specially connected CASE statements allow us to guide placement of L UT functions on successive synthesis runs. This enables us to quickly determine which bits in the FPGA 's configuration file match to FPGA L UT bits. After we know which bits are L UT bits, we can go further and match specific LUT bits to specific bits in the configuration file, thereby creating a one-to-one mapping between every L UT memory cell and its matching bit in the configuration file. In this paper we present our CASE statement functions for performing one-to-one mapping of all FPGA L UT memory cell bits to specific configuration file bits. We have successfully applied our methods to several 7000 series Xilinx and Intel (Altera) FPGAs. 
    more » « less