skip to main content


Title: SIFT: A Tool for Property Directed Symbolic Execution of Multithreaded Software
Analyzing multithreaded programs is notoriously hard due to the exponential number of thread interleavings. Although race detectors can help developers find and fix such bugs before the code is deployed, multithreaded code may still be buggy due to memory errors and assertion violations that are not due to race conditions. This paper presents a property directed symbolic execution of multithreaded code. Our approach, named SIFT, differs from previous work on detecting errors in multithreaded code by being property directed and by handling both memory safety and assertion checking that can be further customized by the user. SIFT can detect bugs that may or may not be due to data races, and works in an iterative way. In each step, it explores the state space using selective scheduling based on a set of interleaving points that have been inferred in the previous step. We have developed three partitioning strategies for improved effectiveness and performance. We have implemented SIFT on top of the KLEE symbolic execution engine and applied it to various real-world and academic benchmarks. SIFT could detect more vulnerabilities than a state-of-the-art memory vulnerability detector.  more » « less
Award ID(s):
1942235
NSF-PAR ID:
10436564
Author(s) / Creator(s):
Date Published:
Journal Name:
ICST'22
Page Range / eLocation ID:
433 to 443
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We describe and evaluate an extensible bug-finding tool, Sys, designed to automatically find security bugs in huge codebases, even when easy-to-find bugs have been already picked clean by years of aggressive automatic checking. Sys uses a two-step approach to find such tricky errors. First, it breaks down large---tens of millions of lines---systems into small pieces using user-extensible static checkers to quickly find and mark potential errorsites. Second, it uses user-extensible symbolic execution to deeply examine these potential errorsites for actual bugs. Both the checkers and the system itself are small (6KLOC total). Sys is flexible, because users must be able to exploit domain- or system-specific knowledge in order to detect errors and suppress false positives in real codebases. Sys finds many security bugs (51 bugs, 43 confirmed) in well-checked code---the Chrome and Firefox web browsers---and code that some symbolic tools struggle with---the FreeBSD operating system. Sys's most interesting results include: an exploitable, cash bountied CVE in Chrome that was fixed in seven hours (and whose patch was backported in two days); a trio of bountied bugs with a CVE in Firefox; and a bountied bug in Chrome's audio support. 
    more » « less
  2. null (Ed.)
    Compilers are generally not aware of the semantics of library-based parallel programming models such as MPI and OpenSHMEM, and hence are unable to detect programming errors related to their use. To alleviate this issue, we developed a custom static checker for OpenSHMEM programs based on LLVM’s Clang Static Analyzer framework (CSA). We leverage the Symbolic Execution engine of the core Static Analyzer framework and its path-sensitive analysis to check for bugs on all OpenSHMEM program paths. We have identified common programming mistakes in OpenSHMEM programs that are detectable at compile-time and provided checks for them in the analyzer. They cover: utilization of the right type of mem- ory (private vs. symmetric memory); safe/synchronized access to program data in the presence of asynchronous, one-sided communication; and double-free of memories allocated using OpenSHMEM memory allocation routines. Our experimental analysis showed that the static checker successfully detects bugs in OpenSHMEM code. 
    more » « less
  3. null (Ed.)
    Assertions are widely used for functional validation as well as coverage analysis for both software and hardware designs. Assertions enable runtime error detection as well as faster localization of errors. While there is a vast literature on both software and hardware assertions for monitoring functional scenarios, there is limited effort in utilizing assertions to monitor System-on-Chip (SoC) security vulnerabilities. We have identified common SoC security vulnerabilities and defined several classes of assertions to enable runtime checking of security vulnerabilities. A major challenge in assertion-based validation is how to activate the security assertions to ensure that they are valid. While existing test generation using model checking is promising, it cannot generate directed tests for large designs due to state space explosion. We propose an automated and scalable mechanism to generate directed tests using a combination of symbolic execution and concrete simulation of RTL models. Experimental results on diverse benchmarks demonstrate that the directed tests are able to activate security assertions non-vacuously. 
    more » « less
  4. null (Ed.)
    Intermittently-powered, energy-harvesting devices operate on energy collected from their environment and must operate intermittently as energy is available. Runtime systems for such devices often rely on checkpoints or redo-logs to save execution state between power cycles, causing arbitrary code regions to re-execute on reboot. Any non-idempotent program behavior—behavior that can change on each execution—can lead to incorrect results. This work investigates non-idempotent behavior caused by repeating I/O operations, not addressed by prior work. If such operations affect a control statement or address of a memory update, they can cause programs to take different paths or write to different memory locations on re-executions, resulting in inconsistent memory states. We provide the first characterization of input-dependent idempotence bugs and develop IBIS-S, a program analysis tool for detecting such bugs at compile time, and IBIS-D, a dynamic information flow tracker to detect bugs at runtime. These tools use taint propagation to determine the reach of input. IBIS-S searches for code patterns leading to inconsistent memory updates, while IBIS-D detects concrete memory inconsistencies. We evaluate IBIS on embedded system drivers and applications. IBIS can detect I/O-dependent idempotence bugs, giving few (IBIS-S) or no (IBIS-D) false positives and providing actionable bug reports. These bugs are common in sensor-driven applications and are not fixed by existing intermittent systems. 
    more » « less
  5. null (Ed.)
    As big data analytics become increasingly popular, data-intensive scalable computing (DISC) systems help address the scalability issue of handling large data. However, automated testing for such data-centric applications is challenging, because data is often incomplete, continuously evolving, and hard to know a priori. Fuzz testing has been proven to be highly effective in other domains such as security; however, it is nontrivial to apply such traditional fuzzing to big data analytics directly for three reasons: (1) the long latency of DISC systems prohibits the applicability of fuzzing: naïve fuzzing would spend 98% of the time in setting up a test environment; (2) conventional branch coverage is unlikely to scale to DISC applications because most binary code comes from the framework implementation such as Apache Spark; and (3) random bit or byte level mutations can hardly generate meaningful data, which fails to reveal real-world application bugs. We propose a novel coverage-guided fuzz testing tool for big data analytics, called BigFuzz. The key essence of our approach is that: (a) we focus on exercising application logic as opposed to increasing framework code coverage by abstracting the DISC framework using specifications. BigFuzz performs automated source to source transformations to construct an equivalent DISC application suitable for fast test generation, and (b) we design schema-aware data mutation operators based on our in-depth study of DISC application error types. BigFuzz speeds up the fuzzing time by 78 to 1477X compared to random fuzzing, improves application code coverage by 20% to 271%, and achieves 33% to 157% improvement in detecting application errors. When compared to the state of the art that uses symbolic execution to test big data analytics, BigFuzz is applicable to twice more programs and can find 81% more bugs. 
    more » « less