skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on December 24, 2025

Title: Worst-Case Input Generation for Concurrent Programs under Non-Monotone Resource Metrics
Worst-case input generation aims to automatically generate inputs that exhibit the worst-case performance of programs. It has several applications, and can, for example, detect vulnerabilities to denial-of-service (DoS) attacks. However, it is non-trivial to generate worst-case inputs for concurrent programs, particularly for resources like memory where the peak cost depends on how processes are scheduled. This article presents the first sound worst-case input generation algorithm for concurrent programs under non-monotone resource metrics like memory. The key insight is to leverage resource-annotated session types and symbolic execution. Session types describe communication protocols on channels in process calculi. Equipped with resource annotations, resource-annotated session types not only encode cost bounds but also indicate how many resources can be reused and transferred between processes. This information is critical for identifying a worst-case execution path during symbolic execution. The algorithm is sound: if it returns any input, it is guaranteed to be a valid worst-case input. The algorithm is also relatively complete: as long as resource-annotated session types are sufficiently expressive and the background theory for SMT solving is decidable, a worst-case input is guaranteed to be returned. A simple case study of a web server's memory usage demonstrates the utility of the worst-case input generation algorithm.  more » « less
Award ID(s):
1845514 2007784
PAR ID:
10566222
Author(s) / Creator(s):
;
Publisher / Repository:
LMCS
Date Published:
Journal Name:
Logical Methods in Computer Science
Volume:
Volume 20, Issue 4
ISSN:
1860-5974
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Concolic testing combines concrete execution with symbolic execution along the executed path to automatically generate new test inputs that exercise program paths and deliver high code coverage during testing. The GKLEE tool uses this approach to expose data races in CUDA programs written for execution of GPGPUs. In programs employing concurrent dynamic data structures, automatic generation of data structures with appropriate shapes that cause threads to follow selected, possibly divergent, paths is a challenge. Moreover, a single non-conflicting data structure must be generated for multiple threads, that is, a single shape must be found that simultaneously causes all threads to follow their respective chosen paths. When an execution exposes a bug (e.g., a data race), the generated data structure shape helps the programmer understand the cause of the bug. Because GKLEE does not permit pointers that construct dynamic data structures to be made symbolic, it cannot automatically generate data structures of different shapes and must rely on the user to write code that constructs them to exercise desired paths. We have developed DSGEN for automatically generating non-conflicting dynamic data structures with different shapes and integrated it with GKLEE to uncover and facilitate understanding of data races in programs that employ complex concurrent dynamic data structures. In comparison to GKLEE, DSGEN increases the number of races detected from 10 to 25 by automatically generating a total of 1,897 shapes in implementations of four complex concurrent dynamic data structures -- B-Tree, Hash-Array Mapped Trie, RRB-Tree, and Skip List. 
    more » « less
  2. We propose a symbolic execution method for programs that can draw random samples. In contrast to existing work, our method can verify randomized programs with unknown inputs and can prove probabilistic properties that universally quantify over all possible inputs. Our technique augments standard symbolic execution with a new class of probabilistic symbolic variables , which represent the results of random draws, and computes symbolic expressions representing the probability of taking individual paths. We implement our method on top of the KLEE symbolic execution engine alongside multiple optimizations and use it to prove properties about probabilities and expected values for a range of challenging case studies written in C++, including Freivalds’ algorithm, randomized quicksort, and a randomized property-testing algorithm for monotonicity. We evaluate our method against Psi, an exact probabilistic symbolic inference engine, and Storm, a probabilistic model checker, and show that our method significantly outperforms both tools. 
    more » « less
  3. Session types provide a formal type system to define and verify communication protocols between message-passing processes. In order to analyze randomized systems, recent works have extended session types with probabilistic type constructors. Unfortunately, all the proposed extensions only support constant probabilities which limits their applicability to real-world systems. Our work addresses this limitation by introducing probabilistic refinement session types which enable symbolic reasoning for concurrent probabilistic systems in a core calculus we call PReST. The type system is carefully designed to be a conservative extension of refinement session types and supports both probabilistic and regular choice type operators. We also implement PReST in a prototype which we use for validating probabilistic concurrent programs. The added expressive power leads to significant challenges, both in the meta theory and implementation of PReST, particularly with type checking: it requires reconstructing intermediate types for channels when type checking probabilistic branching expressions. The theory handles this by semantically quantifying refinement variables in probabilistic typing rules, a deviation from standard refinement type systems. The implementation relies on a bi-directional type checker that uses an SMT solver to reconstruct the intermediate types minimizing annotation overhead and increasing usability. To guarantee that probabilistic processes are almost-surely terminating, we integrate cost analysis into our type system to obtain expected upper bounds on recursion depth. We evaluate PReST on a wide variety of benchmarks from 4 categories: (i) randomized distributed protocols such as Itai and Rodeh's leader election, bounded retransmission, etc., (ii) parametric Markov chains such as random walks, (iii) probabilistic analysis of concurrent data structures such as queues, and (iv) distributions obtained by composing uniform distributions using operators like max, and sum. Our experiments show that the PReST type checker scales to large programs with sophisticated probabilistic distributions. 
    more » « less
  4. Just, René; Fraser, Gordon (Ed.)
    Starting with a random initial seed, fuzzers search for inputs that trigger bugs or vulnerabilities. However, fuzzers often fail to generate inputs for program paths guarded by restrictive branch conditions. In this paper, we show that by first identifying rare-paths in programs (i.e., program paths with path constraints that are unlikely to be satisfied by random input generation), and then, generating inputs/seeds that trigger rare-paths, one can improve the coverage of fuzzing tools. In particular, we present techniques 1) that identify rare paths using quantitative symbolic analysis, and 2) generate inputs that can explore these rare paths using path-guided concolic execution. We provide these inputs as initial seed sets to three state of the art fuzzers. Our experimental evaluation on a set of programs shows that the fuzzers achieve better coverage with the rare-path based seed set compared to a random initial seed. 
    more » « less
  5. Detecting regression bugs in software evolution, analyzing side-channels in programs and evaluating robustness in deep neural networks (DNNs) can all be seen as instances of differential software analysis, where the goal is to generate diverging executions of program paths. Two executions are said to be diverging if the observable program behavior differs, e.g., in terms of program output, execution time, or (DNN) classification. The key challenge of differential software analysis is to simultaneously reason about multiple program paths, often across program variants. This paper presents HyDiff, the first hybrid approach for differential software analysis. HyDiff integrates and extends two very successful testing techniques: Feedback-directed greybox fuzzing for efficient program testing and shadow symbolic execution for systematic program exploration. HyDiff extends greybox fuzzing with divergence-driven feedback based on novel cost metrics that take into account the control flow graph of the program. Furthermore HyDiff extends shadow symbolic execution by applying four-way forking in a systematic exploration and still having the ability to incorporate concrete inputs in the analysis. HyDiff applies divergence revealing heuristics based on resource consumption and control-flow information to efficiently guide the symbolic exploration, which allows its efficient usage beyond regression testing applications. We introduce differential metrics such as output, decision and cost difference, as well as patch distance, to assist the fuzzing and symbolic execution components in maximizing the execution divergence. We implemented our approach on top of the fuzzer AFL and the symbolic execution framework Symbolic PathFinder. We illustrate HyDiff on regression and side-channel analysis for Java bytecode programs, and further show how to use HyDiff for robustness analysis of neural networks. 
    more » « less