skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Toward Tool-Independent Summaries for Symbolic Execution
We introduce a new symbolic reflection API for implementing tool-independent summaries for the symbolic execution of C programs. We formalise the proposed API as a symbolic semantics and extend two state-of-the-art symbolic execution tools with support for it. Using the proposed API, we implement 67 tool-independent symbolic summaries for a total of 26 libc functions. Furthermore, we present SumBoundVerify, a fully automatic summary validation tool for checking the bounded correctness of the symbolic summaries written using our symbolic reflection API. We use SumBoundVerify to validate 37 symbolic summaries taken from 3 state-of-the-art symbolic execution tools, angr, Binsec and Manticore, detecting a total of 24 buggy summaries.  more » « less
Award ID(s):
1718713
PAR ID:
10465440
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
ECOOP
Volume:
LIPIcs
Issue:
263
Page Range / eLocation ID:
24:1-24:25
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In spite of decades of research in bug detection tools, there is a surprising dearth of ground-truth corpora that can be used to evaluate the efficacy of such tools. Recently, systems such as LAVA and EvilCoder have been proposed to automatically inject bugs into software to quickly generate large bug corpora, but the bugs created so far differ from naturally occurring bugs in a number of ways. In this work, we propose a new automated bug injection system, Apocalypse, that uses formal techniques—symbolic execution, constraint-based program synthesis and model counting—to automatically inject fair (can potentially be discovered by current bug-detection tools), deep (requiring a long sequence of dependencies to be satisfied to fire), uncorrelated (each bug behaving independent of others), reproducible (a trigger input being available) and rare (can be triggered by only a few program inputs) bugs in large software code bases. In our evaluation, we inject bugs into thirty Coreutils programs as well as the TCAS test suite. We find that bugs synthesized by Apocalypse are highly realistic under a variety of metrics, that they do not favor a particular bug-finding strategy (unlike bugs produced by LAVA), and that they are more difficult to find than manually injected bugs, requiring up around 240× more tests to discover with a state-of-the-art symbolic execution tool. 
    more » « less
  2. Nadel, Alexander; Rozier, Kristin Yvonne (Ed.)
    Symbolic execution is a powerful verification tool for hardware designs, in particular for security validation. However, symbolic execution suffers from the path explosion problem in which the number of paths to explore grows exponentially with the number of branches in the design. We introduce a new approach, piecewise composition, which leverages the modular structure of hardware to transfer the work of path exploration to SMT solvers. Piecewise composition works by recognizing that independent parts of a design can each be explored once, and the exploration reused. A hardware design with N independent always blocks and at most b branch points per block will require exploration of O((2^b)N) paths in a single clock cycle with our approach compared to O(2^(bN)) paths using traditional symbolic execution. We present Sylvia, a symbolic execution engine implementing piecewise composition. The engine operates directly over RTL without requiring translation to a netlist or software simulation. We evaluate our tool on multiple open-source SoC and CPU designs, including the OR1200 and PULPissimo RISC-V SoC. The piecewise composition technique reduces the number of paths explored by an order of magnitude and reduces the runtime by 97% compared to our baseline. Using 84 properties from the security literature we find assertion violations in open-source designs that traditional model checking and formal verification tools do not find. 
    more » « less
  3. Symbolic execution is an automated test input generation technique that models individual program paths as logical constraints. However, the realism of concrete test inputs generated by SMT solvers often comes into question. Existing symbolic execution tools only seek arbitrary solutions for given path constraints. These constraints do not incorporate the naturalness of inputs that observe statistical distributions, range constraints, or preferred string constants. This results in unnatural-looking inputs that fail to emulate real-world data. In this paper, we extend symbolic execution with consideration for incorporating naturalness. Our key insight is that users typically understand the semantics of program inputs, such as the distribution of height or possible values of zipcode, which can be leveraged to advance the ability of symbolic execution to produce natural test inputs. We instantiate this idea in NaturalSym, a symbolic execution-based test generation tool for data-intensive scalable computing (DISC) applications. NaturalSym generates natural-looking data that mimics real-world distributions by utilizing user-provided input semantics to drastically enhance the naturalness of inputs, while preserving strong bug-finding potential. On DISC applications and commercial big data test benchmarks, NaturalSym achieves a higher degree of realism —as evidenced by a perplexity score 35.1 points lower on median, and detects 1.29× injected faults compared to the state-of-the-art symbolic executor for DISC, BigTest. This is because BigTest draws inputs purely based on the satisfiability of path constraints constructed from branch predicates, while NaturalSym is able to draw natural concrete values based on user-specified semantics and prioritize using these values in input generation. Our empirical results demonstrate that NaturalSym finds injected faults 47.8× more than NaturalFuzz (a coverage-guided fuzzer) and 19.1× more than ChatGPT. Meanwhile, TestMiner (a mining-based approach) fails to detect any injected faults. NaturalSym is the first symbolic executor that combines the notion of input naturalness in symbolic path constraints during SMT-based input generation. We make our code available at https://github.com/UCLA-SEAL/NaturalSym. 
    more » « less
  4. With the rise of increasingly advanced reverse engineering technique, especially more scalable symbolic execution tools, software obfuscation faces great challenges. Branch conditions contain important control flow logic of a program. Adversaries can use powerful program analysis tools to collect sensitive program properties and recover a pro- gram’s internal logic, stealing intellectual properties from the original owner. In this paper, we propose a novel control obfuscation technique that uses lambda calculus to hide the original computation semantics and makes the original program more obscure to understand and re- verse engineer. Our obfuscator replaces the conditional instructions with lambda calculus function calls that simulate the same behavior with a more complicated execution model. Our experiment result shows that our obfuscation method can protect sensitive branch conditions from state- of-the-art symbolic execution techniques, with only modest overhead. 
    more » « less
  5. Extensive research has been conducted to explore cryptographic API misuse in Java. However, despite the tremendous popularity of the Python language, uncovering similar issues has not been fully explored. The current static code analysis tools for Python are unable to scan the increasing complexity of the source code. This limitation decreases the analysis depth, resulting in more undetected cryptographic misuses. In this research, we propose Cryptolation, a Static Code Analysis (SCA) tool that provides security guarantees for complex Python cryptographic code. Most existing analysis tools for Python solely focus on specific Frameworks such as Django or Flask. However, using a SCA approach, Cryptolation focuses on the language and not any framework. Cryptolation performs an inter-procedural data-flow analysis to handle many Python language features through variable inference (statically predicting what the variable value is) and SCA. Cryptolation covers 59 Python cryptographic modules and can identify 18 potential cryptographic misuses that involve complex language features. In this paper, we also provide a comprehensive analysis and a state-of-the-art benchmark for understanding the Python cryptographic Application Program Interface (API) misuses and their detection. Our state-of-the-art benchmark PyCryptoBench includes 1,836 Python cryptographic test cases that cover both 18 cryptographic rules and five language features. PyCryptoBench also provides a framework for evaluating and comparing different cryptographic scanners for Python. To evaluate the performance of our proposed cryptographic Python scanner, we evaluated Cryptolation against three other state-of-the-art tools: Bandit, Semgrep, and Dlint. We evaluated these four tools using our benchmark PyCryptoBench and manual evaluation of (four Top-Ranked and 939 Un-Ranked) real-world projects. Our results reveal that, overall, Cryptolation achieved the highest precision throughout our testing; and the highest accuracy on our benchmark. Cryptolation had 100% precision on PyCryptoBench, and the highest precision on real-world projects. 
    more » « less