Single-Source-Single-Target Interleaved-Dyck Reachability via Integer Linear Programming

Li, Yuanbo; Zhang, Qirun; Reps, Thomas

doi:10.1145/3571228

An interleaved-Dyck (InterDyck) language consists of the interleaving of two or more Dyck languages, where each Dyck language represents a set of strings of balanced parentheses.InterDyck-reachability is a fundamental framework for program analyzers that simultaneously track multiple properly-matched pairs of actions such as call/return, lock/unlock, or write-data/read-data.Existing InterDyck-reachability algorithms are based on the well-known tabulation technique.

This paper presents a new perspective on solving InterDyck-reachability. Our key observation is that for the single-source-single-target InterDyck-reachability variant, it is feasible to summarize all paths from the source node to the target node based onpath expressions. Therefore, InterDyck-reachability becomes an InterDyck-path-recognition problem over path expressions. Instead of computing summary edges as in traditional tabulation algorithms, this new perspective enables us to express InterDyck-reachability as aparenthesis-countingproblem, which can be naturally formulated via integer linear programming (ILP).

We implemented our ILP-based algorithm and performed extensive evaluations based on two client analyses (a reachability analysis for concurrent programs and a taint analysis). In particular, we evaluated our algorithm against two types of algorithms: (1) the general all-pairs InterDyck-reachability algorithms based on linear conjunctive language (LCL) reachability and synchronized pushdown system (SPDS) reachability, and (2) two domain-specific algorithms for both client analyses. The experimental results are encouraging. Our algorithm achieves 1.42×, 28.24×, and 11.76× speedup for the concurrency-analysis benchmarks compared to all-pair LCL-reachability, SPDS-reachability, and domain-specific tools, respectively; 1.2×, 69.9×, and 0.98× speedup for the taint-analysis benchmarks. Moreover, the algorithm also provides precision improvements, particularly for taint analysis, where it achieves 4.55%, 11.1%, and 6.8% improvement, respectively.

More Like this