skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Lambda Obfuscation
With the rise of increasingly advanced reverse engineering technique, especially more scalable symbolic execution tools, software obfuscation faces great challenges. Branch conditions contain important control flow logic of a program. Adversaries can use powerful program analysis tools to collect sensitive program properties and recover a pro- gram’s internal logic, stealing intellectual properties from the original owner. In this paper, we propose a novel control obfuscation technique that uses lambda calculus to hide the original computation semantics and makes the original program more obscure to understand and re- verse engineer. Our obfuscator replaces the conditional instructions with lambda calculus function calls that simulate the same behavior with a more complicated execution model. Our experiment result shows that our obfuscation method can protect sensitive branch conditions from state- of-the-art symbolic execution techniques, with only modest overhead.  more » « less
Award ID(s):
1320605
PAR ID:
10066915
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the 13th EAI International Conference on Security and Privacy in Communication Networks (SecureComm 2017)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Obfuscation is an important technique to protect software from adversary analysis. Control flow obfuscation effectively prevents attackers from understanding the program structure, hence impeding a broad set of reverse engineering efforts. In this paper, we propose a novel control flow obfuscation method which employs Turing machines to simulate the computation of branch conditions. By weaving the orig- inal program with Turing machine components, program control flow graph and call graph can become much more complicated. In addition, due to the runtime computation complexity of a Turing machine, pro- gram execution flow would be highly obfuscated and become resilient to advanced reverse engineering approaches via symbolic execution and concolic testing. We have implemented a prototype tool for Turing obfuscation. Compar- ing with previous work, our control flow obfuscation technique delivers three distinct advantages. 1). Complexity: the complicated structure of a Turing machine makes it difficult for attackers to understand the program control flow. 2). Universality: Turing machines can encode any compu- tation and hence applicable to obfuscate any program component. 3). Resiliency: Turing machine brings in complex execution model, which is shown to withstand automated reverse engineering efforts. Our evalua- tion obfuscates control flow predicates of two widely-used applications, and the experimental results show that the proposed technique can ob- fuscate programs in stealth with good performance and robustness. 
    more » « less
  2. Symbolic execution is an automated test input generation technique that models individual program paths as logical constraints. However, the realism of concrete test inputs generated by SMT solvers often comes into question. Existing symbolic execution tools only seek arbitrary solutions for given path constraints. These constraints do not incorporate the naturalness of inputs that observe statistical distributions, range constraints, or preferred string constants. This results in unnatural-looking inputs that fail to emulate real-world data. In this paper, we extend symbolic execution with consideration for incorporating naturalness. Our key insight is that users typically understand the semantics of program inputs, such as the distribution of height or possible values of zipcode, which can be leveraged to advance the ability of symbolic execution to produce natural test inputs. We instantiate this idea in NaturalSym, a symbolic execution-based test generation tool for data-intensive scalable computing (DISC) applications. NaturalSym generates natural-looking data that mimics real-world distributions by utilizing user-provided input semantics to drastically enhance the naturalness of inputs, while preserving strong bug-finding potential. On DISC applications and commercial big data test benchmarks, NaturalSym achieves a higher degree of realism —as evidenced by a perplexity score 35.1 points lower on median, and detects 1.29× injected faults compared to the state-of-the-art symbolic executor for DISC, BigTest. This is because BigTest draws inputs purely based on the satisfiability of path constraints constructed from branch predicates, while NaturalSym is able to draw natural concrete values based on user-specified semantics and prioritize using these values in input generation. Our empirical results demonstrate that NaturalSym finds injected faults 47.8× more than NaturalFuzz (a coverage-guided fuzzer) and 19.1× more than ChatGPT. Meanwhile, TestMiner (a mining-based approach) fails to detect any injected faults. NaturalSym is the first symbolic executor that combines the notion of input naturalness in symbolic path constraints during SMT-based input generation. We make our code available at https://github.com/UCLA-SEAL/NaturalSym. 
    more » « less
  3. Nadel, Alexander; Rozier, Kristin Yvonne (Ed.)
    Symbolic execution is a powerful verification tool for hardware designs, in particular for security validation. However, symbolic execution suffers from the path explosion problem in which the number of paths to explore grows exponentially with the number of branches in the design. We introduce a new approach, piecewise composition, which leverages the modular structure of hardware to transfer the work of path exploration to SMT solvers. Piecewise composition works by recognizing that independent parts of a design can each be explored once, and the exploration reused. A hardware design with N independent always blocks and at most b branch points per block will require exploration of O((2^b)N) paths in a single clock cycle with our approach compared to O(2^(bN)) paths using traditional symbolic execution. We present Sylvia, a symbolic execution engine implementing piecewise composition. The engine operates directly over RTL without requiring translation to a netlist or software simulation. We evaluate our tool on multiple open-source SoC and CPU designs, including the OR1200 and PULPissimo RISC-V SoC. The piecewise composition technique reduces the number of paths explored by an order of magnitude and reduces the runtime by 97% compared to our baseline. Using 84 properties from the security literature we find assertion violations in open-source designs that traditional model checking and formal verification tools do not find. 
    more » « less
  4. Design-for-test/debug (DfT/D) introduces scan chain testing to increase testability and fault coverage by inserting scan flip-flops. However, these scan chains are also known to be a liability for security primitives. In previous research, the dynamically obfuscated scan chain (DOSC) was introduced to protect logic-locking keys from scan-based attacks by obscuring test patterns and responses. In this paper, we present DOSCrack, an oracle-guided attack to de-obfuscate DOSC using symbolic execution and binary clustering, which significantly reduces the candidate seed space to a manageable quantity. Our symbolic execution engine employs scan mode simulation and satisfiability modulo theories (SMT) solvers to reduce the possible seed space, while obfuscation key clustering allows us to effectively rule out a group of seeds that share similarities. An integral component of our approach is the use of sequential equivalence checking (SEC), which aids in identifying distinct simulation patterns to differentiate between potential obfuscation keys. We experimentally applied our DOSCrack framework on four different sizes of DOSC benchmarks and compared their runtime and complexity. Finally, we propose a low-cost countermeasure to DOSCrack which incorporates a nonlinear feedback shift register (NLFSR) to increase the effort of symbolic execution modeling and serves as an effective defense against our DOSCrack framework. Our research effectively addresses a critical vulnerability in scan-chain obfuscation methodologies, offering insights into DfT/D and logic locking for both academic research and industrial applications. Our framework highlights the need to craft robust and adaptable defense mechanisms to counter evolving scan-based attacks. 
    more » « less
  5. Cryptographic functions have been commonly abused by malware developers to hide malicious behaviors, disguise destructive payloads, and bypass network-based fire- walls. Now-infamous crypto-ransomware even encrypts victim’s computer documents until a ransom is paid. Therefore, de- tecting cryptographic functions in binary code is an appealing approach to complement existing malware defense and forensics. However, pervasive control and data obfuscation schemes make cryptographic function identification a challenging work. Existing detection methods are either brittle to work on obfuscated binaries or ad hoc in that they can only identify specific cryp- tographic functions. In this paper, we propose a novel technique called bit-precise symbolic loop mapping to identify cryptographic functions in obfuscated binary code. Our trace-based approach captures the semantics of possible cryptographic algorithms with bit-precise symbolic execution in a loop. Then we perform guided fuzzing to efficiently match boolean formulas with known reference implementations. We have developed a prototype called CryptoHunt and evaluated it with a set of obfuscated synthetic examples, well-known cryptographic libraries, and malware. Compared with the existing tools, CryptoHunt is a general approach to detecting commonly used cryptographic functions such as TEA, AES, RC4, MD5, and RSA under different control and data obfuscation scheme combinations. 
    more » « less