Symbolic execution is a popular software testing technique that can systematically examine program code to find bugs. Owing to the prevalence of software vulnerabilities, symbolic execution has been extensively used to detect software vulnerabilities. A major challenge of using symbolic execution to find complicated vulnerabilities such as use-after-free is to not only direct symbolic execution to explore relevant program paths but also explore the paths in a specific order. In this paper, we describe a targeted symbolic execution, called UAFDetect, for finding use-after-free vulnerabilities efficiently. UAFDetect guides symbolic execution to focus on paths that are likely to cause a use-after-free by pruning paths that are unlikely or infeasible to cause that. It uses dynamic typestate analysis to identify unlikely paths and static control flow analysis to identify infeasible paths. UAFDetect performs typestate analysis to detect the occurrence of use-after-free vulnerabilities. Upon the detection of a vulnerability, UAFDetect generates an exploit to trigger the vulnerability. We develop the prototype of UAFDetect and evaluate it on real-world use-after-free vulnerabilities. The evaluation demonstrates that UAFDetect can find use-after-free vulnerabilities effectively and efficiently.
more »
« less
Teaching with angr: A Symbolic Execution Curriculum and CTF
Symbolic execution is an essential tool in modern program analysis and vulnerability discovery. The technique is used to both find and fix vulnerabilities as well as to identify and exploit them. In order to ensure that symbolic execution tools are used more for the former, rather than the latter, we describe a curriculum and a set of scaffolded, polymorphic, “capture-the-flag” (CTF) exercises that have been developed to help students learn and utilize the technique to help ensure the software they produce is secure.
more »
« less
- Award ID(s):
- 1821841
- PAR ID:
- 10292660
- Date Published:
- Journal Name:
- USENIX Advances in Security Education
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Symbolic execution is a powerful program analysis and testing technique. Symbolic execution engines are usually implemented as interpreters, and the induced interpretation over-head can dramatically inhibit performance. Alternatively, implementation choices based on instrumentation provide a limited ability to transform programs. However, the use of compilation and code generation techniques beyond simple instrumentation remains underexplored for engine construction, leaving potential performance gains untapped. In this paper, we show how to tap some of these gains using sophisticated compilation techniques: We present Gensym, an optimizing symbolic-execution compiler that generates symbolic code which explores paths and generates tests in parallel. The key insight of GensYmis to compile symbolic execution tasks into cooperative concurrency via continuation-passing style, which further enables efficient parallelism. The design and implementation of Gensym is based on partial evaluation and generative programming techniques, which make it high-level and performant at the same time. We compare the performance of Gensym against the prior symbolic-execution compiler LLSC and the state-of-the-art symbolic interpreter KLEE. The results show an average 4.6× speedup for sequential execution and 9.4× speedup for parallel execution on 20 benchmark programs.more » « less
-
Concolic execution is a powerful technique in software testing, as it can systematically explore the code paths and is capable of traversing complex branches. It combines concrete execution for environment modeling and symbolic execution for path exploration. While significant research efforts in concolic execution have been directed toward the improvement of symbolic execution and constraint solving, our study pivots toward the often overlooked yet most common aspect: concrete execution. Our analysis shows that state-of-the-art binary concolic executors have largely overlooked the overhead in the execution of concrete instructions. In light of this observation, we propose optimizations to make the common (concrete) case fast. To validate this idea, we develop the prototype, SYMFIT, and evaluate it on standard benchmarks and realworld applications. The results showed that the performance of pure concrete execution is much faster than the baseline SYMQEMU, and is comparable to the vanilla QEMU. Moreover, we showed that the fast symbolic tracing capability of SYMFIT can significantly improve the efficiency of crash deduplication.more » « less
-
We propose a symbolic execution method for programs that can draw random samples. In contrast to existing work, our method can verify randomized programs with unknown inputs and can prove probabilistic properties that universally quantify over all possible inputs. Our technique augments standard symbolic execution with a new class of probabilistic symbolic variables , which represent the results of random draws, and computes symbolic expressions representing the probability of taking individual paths. We implement our method on top of the KLEE symbolic execution engine alongside multiple optimizations and use it to prove properties about probabilities and expected values for a range of challenging case studies written in C++, including Freivalds’ algorithm, randomized quicksort, and a randomized property-testing algorithm for monotonicity. We evaluate our method against Psi, an exact probabilistic symbolic inference engine, and Storm, a probabilistic model checker, and show that our method significantly outperforms both tools.more » « less
-
Symbolic execution is an automated test input generation technique that models individual program paths as logical constraints. However, the realism of concrete test inputs generated by SMT solvers often comes into question. Existing symbolic execution tools only seek arbitrary solutions for given path constraints. These constraints do not incorporate the naturalness of inputs that observe statistical distributions, range constraints, or preferred string constants. This results in unnatural-looking inputs that fail to emulate real-world data. In this paper, we extend symbolic execution with consideration for incorporating naturalness. Our key insight is that users typically understand the semantics of program inputs, such as the distribution of height or possible values of zipcode, which can be leveraged to advance the ability of symbolic execution to produce natural test inputs. We instantiate this idea in NaturalSym, a symbolic execution-based test generation tool for data-intensive scalable computing (DISC) applications. NaturalSym generates natural-looking data that mimics real-world distributions by utilizing user-provided input semantics to drastically enhance the naturalness of inputs, while preserving strong bug-finding potential. On DISC applications and commercial big data test benchmarks, NaturalSym achieves a higher degree of realism —as evidenced by a perplexity score 35.1 points lower on median, and detects 1.29× injected faults compared to the state-of-the-art symbolic executor for DISC, BigTest. This is because BigTest draws inputs purely based on the satisfiability of path constraints constructed from branch predicates, while NaturalSym is able to draw natural concrete values based on user-specified semantics and prioritize using these values in input generation. Our empirical results demonstrate that NaturalSym finds injected faults 47.8× more than NaturalFuzz (a coverage-guided fuzzer) and 19.1× more than ChatGPT. Meanwhile, TestMiner (a mining-based approach) fails to detect any injected faults. NaturalSym is the first symbolic executor that combines the notion of input naturalness in symbolic path constraints during SMT-based input generation. We make our code available at https://github.com/UCLA-SEAL/NaturalSym.more » « less
An official website of the United States government

