skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Analyzing Semantic Correctness with Symbolic Execution: A Case Study on PKCS#1 v1.5 Signature Verification
We discuss how symbolic execution can be used to not only find low-level errors but also analyze the semantic correctness of protocol implementations. To avoid manually crafting test cases, we propose a strategy of meta-level search, which leverages constraints stemmed from the input formats to automatically generate concolic test cases. Additionally, to aid root-cause analysis, we develop constraint provenance tracking (CPT), a mechanism that associates atomic sub-formulas of path constraints with their corresponding source level origins. We demonstrate the power of symbolic analysis with a case study on PKCS#1 v1.5 signature verification. Leveraging meta-level search and CPT, we analyzed 15 recent open-source implementations using symbolic execution and found semantic flaws in 6 of them. Further analysis of these flaws showed that 4 implementations are susceptible to new variants of the Bleichenbacher low-exponent RSA signature forgery. One implementation suffers from potential denial of service attacks with purposefully crafted signatures. All our findings have been responsibly shared with the affected vendors. Among the flaws discovered, 6 new CVEs have been assigned to the immediately exploitable ones.  more » « less
Award ID(s):
1657124
PAR ID:
10095802
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Network and Distributed Systems Security (NDSS) Symposium 2019
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. One of the most significant challenges in the field of software code auditing is the presence of vulnerabilities in software source code. Every year, more and more software flaws are discovered, either internally in proprietary code or publicly disclosed. These flaws are highly likely to be exploited and can lead to system compromise, data leakage, or denial of service. To create a large-scale machine learning system for function-level vulnerability identification, we utilized a sizable dataset of C and C++ open-source code containing millions of functions with potential buffer overflow exploits. We have developed an efficient and scalable vulnerability detection method based on neural network models that learn features extracted from the source codes. The source code is first converted into an intermediate representation to remove unnecessary components and shorten dependencies. We maintain the semantic and syntactic information using state-ofthe- art word embedding algorithms such as GloVe and fastText. The embedded vectors are subsequently fed into neural networks such as LSTM, BiLSTM, LSTM-Autoencoder, word2vec, BERT, and GPT-2 to classify the possible vulnerabilities. Furthermore, we have proposed a neural network model that can overcome issues associated with traditional neural networks. We have used evaluation metrics such as F1 score, precision, recall, accuracy, and total execution time to measure the performance. We have conducted a comparative analysis between results derived from features containing a minimal text representation and semantic and syntactic information. We have found that all neural network models provide higher accuracy when we use semantic and syntactic information as features. However, this approach requires more execution time due to the added complexity of the word embedding algorithm. Moreover, our proposed model provides higher accuracy than LSTM, BiLSTM, LSTM-Autoencoder, word2vec and BERT models, and the same accuracy as the GPT-2 model with greater efficiency. 
    more » « less
  2. We present SEIF, an exploratory methodology for information flow verification based on symbolic execution. SEIF begins with a statically built overapproximation of the information flow through a design and uses guided symbolic execution to provide a more precise picture of how information flows from a given set of security critical signals. SEIF can recognize and eliminate non-flows with high precision and for the true flows can find the corresponding paths through the design state with high coverage. We evaluate SEIF on two open-source CPUs, an AES core, and the AKER access control module. SEIF can be used to find counterexamples to information flow properties, and also to explore all flows originating from a source signal of interest. SEIF accounts for 86–90% of statically identified possible flows in three open-source designs. SEIF’s search strategies enable exploring the designs for 10-12 clock cycles in 4-6 seconds on average, demonstrating that this new exploratory style of information flow analysis can be practical. 
    more » « less
  3. As the number of Internet of Things (IoT) devices proliferate, an in-depth understanding of the IoT attack surface has become quintessential for dealing with the security and reliability risks. IoT devices and components execute implementations of various communication protocols. Vulnerabilities in the protocol stack implementations form an important part of the IoT attack surface. Therefore, finding memory errors in such implementations is essential for improving the IoT security and reliability. This paper presents a tool, SEESAW, that is built on top of a static analysis tool and a symbolic execution engine to achieve scalable analysis of protocol stack implementations. SEESAW leverages the API model of the analyzed code base to perform component-level analysis. SEESAW has been applied to the USB and Bluetooth modules within the Linux kernel. SEESAW can reproduce known memory vulnerabilities in a more scalable way compared to baseline symbolic execution. 
    more » « less
  4. Information leaks are a significant problem in modern software systems. In recent years, information theoretic concepts, such as Shannon entropy, have been applied to quantifying information leaks in programs. One recent approach is to use symbolic execution together with model counting constraints solvers in order to quantify information leakage. There are at least two reasons for unsoundness in quantifying information leakage using this approach: 1) Symbolic execution may not be able to explore all execution paths, 2) Model counting constraints solvers may not be able to provide an exact count. We present a sound symbolic quantitative information flow analysis that bounds the information leakage both for the cases where the program behavior is not fully explored and the model counting constraint solver is unable to provide a precise model count but provides an upper and a lower bound. We implemented our approach as an extension to KLEE for computing sound bounds for information leakage in C programs. 
    more » « less
  5. Machine learning at the extreme edge has enabled a plethora of intelligent, time-critical, and remote applications. However, deploying interpretable artificial intelligence systems that can perform high-level symbolic reasoning and satisfy the underlying system rules and physics within the tight platform resource constraints is challenging. In this paper, we introduceTinyNS, the first platform-aware neurosymbolic architecture search framework for joint optimization of symbolic and neural operators.TinyNSprovides recipes and parsers to automatically write microcontroller code for five types of neurosymbolic models, combining the context awareness and integrity of symbolic techniques with the robustness and performance of machine learning models.TinyNSuses a fast, gradient-free, black-box Bayesian optimizer over discontinuous, conditional, numeric, and categorical search spaces to find the best synergy of symbolic code and neural networks within the hardware resource budget. To guarantee deployability,TinyNStalks to the target hardware during the optimization process. We showcase the utility ofTinyNSby deploying microcontroller-class neurosymbolic models through several case studies. In all use cases,TinyNSoutperforms purely neural or purely symbolic approaches while guaranteeing execution on real hardware. 
    more » « less