skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: SyML: Guiding Symbolic Execution Toward Vulnerable States Through Pattern Learning
Exploring many execution paths in a binary program is essential to discover new vulnerabilities. Dynamic Symbolic Execution (DSE) is useful to trigger complex input conditions and enables an accurate exploration of a program while providing extensive crash replayability and semantic insights. However, scaling this type of analysis to complex binaries is difficult. Current methods suffer from the path explosion problem, despite many attempts to mitigate this challenge (e.g., by merging paths when appropriate). Still, in general, this challenge is not yet surmounted, and most bugs discovered through such techniques are shallow. We propose a novel approach to address the path explosion problem: A smart triaging system that leverages supervised machine learning techniques to replicate human expertise, leading to vulnerable path discovery. Our approach monitors the execution traces in vulnerable programs and extracts relevant features—register and memory accesses, function complexity, system calls—to guide the symbolic exploration. We train models to learn the patterns of vulnerable paths from the extracted features, and we leverage their predictions to discover interesting execution paths in new programs. We implement our approach in a tool called SyML, and we evaluate it on the Cyber Grand Challenge (CGC) dataset—a well-known dataset of vulnerable programs—and on 3 real-world Linux binaries. We show that the knowledge collected from the analysis of vulnerable paths, without any explicit prior knowledge about vulnerability patterns, is transferrable to unseen binaries, and leads to outperforming prior work in path prioritization by triggering more, and different, unique vulnerabilities.  more » « less
Award ID(s):
1704253
PAR ID:
10346374
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Date Published:
Journal Name:
RAID '21: 24th International Symposium on Research in Attacks, Intrusions and Defenses
Page Range / eLocation ID:
456 to 468
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Detecting regression bugs in software evolution, analyzing side-channels in programs and evaluating robustness in deep neural networks (DNNs) can all be seen as instances of differential software analysis, where the goal is to generate diverging executions of program paths. Two executions are said to be diverging if the observable program behavior differs, e.g., in terms of program output, execution time, or (DNN) classification. The key challenge of differential software analysis is to simultaneously reason about multiple program paths, often across program variants. This paper presents HyDiff, the first hybrid approach for differential software analysis. HyDiff integrates and extends two very successful testing techniques: Feedback-directed greybox fuzzing for efficient program testing and shadow symbolic execution for systematic program exploration. HyDiff extends greybox fuzzing with divergence-driven feedback based on novel cost metrics that take into account the control flow graph of the program. Furthermore HyDiff extends shadow symbolic execution by applying four-way forking in a systematic exploration and still having the ability to incorporate concrete inputs in the analysis. HyDiff applies divergence revealing heuristics based on resource consumption and control-flow information to efficiently guide the symbolic exploration, which allows its efficient usage beyond regression testing applications. We introduce differential metrics such as output, decision and cost difference, as well as patch distance, to assist the fuzzing and symbolic execution components in maximizing the execution divergence. We implemented our approach on top of the fuzzer AFL and the symbolic execution framework Symbolic PathFinder. We illustrate HyDiff on regression and side-channel analysis for Java bytecode programs, and further show how to use HyDiff for robustness analysis of neural networks. 
    more » « less
  2. Nadel, Alexander; Rozier, Kristin Yvonne (Ed.)
    Symbolic execution is a powerful verification tool for hardware designs, in particular for security validation. However, symbolic execution suffers from the path explosion problem in which the number of paths to explore grows exponentially with the number of branches in the design. We introduce a new approach, piecewise composition, which leverages the modular structure of hardware to transfer the work of path exploration to SMT solvers. Piecewise composition works by recognizing that independent parts of a design can each be explored once, and the exploration reused. A hardware design with N independent always blocks and at most b branch points per block will require exploration of O((2^b)N) paths in a single clock cycle with our approach compared to O(2^(bN)) paths using traditional symbolic execution. We present Sylvia, a symbolic execution engine implementing piecewise composition. The engine operates directly over RTL without requiring translation to a netlist or software simulation. We evaluate our tool on multiple open-source SoC and CPU designs, including the OR1200 and PULPissimo RISC-V SoC. The piecewise composition technique reduces the number of paths explored by an order of magnitude and reduces the runtime by 97% compared to our baseline. Using 84 properties from the security literature we find assertion violations in open-source designs that traditional model checking and formal verification tools do not find. 
    more » « less
  3. Just, René; Fraser, Gordon (Ed.)
    Starting with a random initial seed, fuzzers search for inputs that trigger bugs or vulnerabilities. However, fuzzers often fail to generate inputs for program paths guarded by restrictive branch conditions. In this paper, we show that by first identifying rare-paths in programs (i.e., program paths with path constraints that are unlikely to be satisfied by random input generation), and then, generating inputs/seeds that trigger rare-paths, one can improve the coverage of fuzzing tools. In particular, we present techniques 1) that identify rare paths using quantitative symbolic analysis, and 2) generate inputs that can explore these rare paths using path-guided concolic execution. We provide these inputs as initial seed sets to three state of the art fuzzers. Our experimental evaluation on a set of programs shows that the fuzzers achieve better coverage with the rare-path based seed set compared to a random initial seed. 
    more » « less
  4. Symbolic execution is a popular software testing technique that can systematically examine program code to find bugs. Owing to the prevalence of software vulnerabilities, symbolic execution has been extensively used to detect software vulnerabilities. A major challenge of using symbolic execution to find complicated vulnerabilities such as use-after-free is to not only direct symbolic execution to explore relevant program paths but also explore the paths in a specific order. In this paper, we describe a targeted symbolic execution, called UAFDetect, for finding use-after-free vulnerabilities efficiently. UAFDetect guides symbolic execution to focus on paths that are likely to cause a use-after-free by pruning paths that are unlikely or infeasible to cause that. It uses dynamic typestate analysis to identify unlikely paths and static control flow analysis to identify infeasible paths. UAFDetect performs typestate analysis to detect the occurrence of use-after-free vulnerabilities. Upon the detection of a vulnerability, UAFDetect generates an exploit to trigger the vulnerability. We develop the prototype of UAFDetect and evaluate it on real-world use-after-free vulnerabilities. The evaluation demonstrates that UAFDetect can find use-after-free vulnerabilities effectively and efficiently. 
    more » « less
  5. Speculative execution side-channel vulnerabilities in micro-architecture processors have raised concerns about the security of Intel SGX. To understand clearly the security impact of this vulnerability against SGX, this paper makes the following studies: First, to demonstrate the feasibility of the attacks, we present SgxPectre Attacks (the SGX-variants of Spectre attacks) that exploit speculative execution side-channel vulnerabilities to subvert the confidentiality of SGX enclaves. We show that when the branch prediction of the enclave code can be influenced by programs outside the enclave, the control flow of the enclave program can be temporarily altered to execute instructions that lead to observable cache-state changes. An adversary observing such changes can learn secrets inside the enclave memory or its internal registers, thus completely defeating the confidentiality guarantee offered by SGX. Second, to determine whether real-world enclave programs are impacted by the attacks, we develop techniques to automate the search of vulnerable code patterns in enclave binaries using symbolic execution. Our study suggests that nearly any enclave program could be vulnerable to SgxPectre Attacks since vulnerable code patterns are available in most SGX runtimes (e.g., Intel SGX SDK, Rust-SGX, and Graphene-SGX). Third, we apply SgxPectre Attacks to steal seal keys and attestation keys from Intel signed quoting enclaves. The seal key can be used to decrypt sealed storage outside the enclaves and forge valid sealed data; the attestation key can be used to forge attestation signatures. For these reasons, SgxPectre Attacks practically defeat SGX's security protection. Finally, we evaluate Intel's existing countermeasures against SgxPectre Attacks and discusses the security implications. 
    more » « less