skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Cryptographic Function Detection in Obfuscated Binaries via Bit-precise Symbolic Loop Mapping,
Cryptographic functions have been commonly abused by malware developers to hide malicious behaviors, disguise destructive payloads, and bypass network-based fire- walls. Now-infamous crypto-ransomware even encrypts victim’s computer documents until a ransom is paid. Therefore, de- tecting cryptographic functions in binary code is an appealing approach to complement existing malware defense and forensics. However, pervasive control and data obfuscation schemes make cryptographic function identification a challenging work. Existing detection methods are either brittle to work on obfuscated binaries or ad hoc in that they can only identify specific cryp- tographic functions. In this paper, we propose a novel technique called bit-precise symbolic loop mapping to identify cryptographic functions in obfuscated binary code. Our trace-based approach captures the semantics of possible cryptographic algorithms with bit-precise symbolic execution in a loop. Then we perform guided fuzzing to efficiently match boolean formulas with known reference implementations. We have developed a prototype called CryptoHunt and evaluated it with a set of obfuscated synthetic examples, well-known cryptographic libraries, and malware. Compared with the existing tools, CryptoHunt is a general approach to detecting commonly used cryptographic functions such as TEA, AES, RC4, MD5, and RSA under different control and data obfuscation scheme combinations.  more » « less
Award ID(s):
1320605
PAR ID:
10066918
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 38th IEEE Symposium on Security and Privacy
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The key challenge of software reverse engi- neering is that the source code of the program under in- vestigation is typically not available. Identifying differ- ences between two executable binaries (binary diffing) can reveal valuable information in the absence of source code, such as vulnerability patches, software plagiarism evidence, and malware variant relations. Recently, a new binary diffing method based on symbolic execution and constraint solving has been proposed to look for the code pairs with the same semantics, even though they are ostensibly different in syntactics. Such semantics- based method captures intrinsic differences/similarities of binary code, making it a compelling choice to analyze highly-obfuscated malicious programs. However, due to the nature of symbolic execution, semantics-based bi- nary diffing suffers from significant performance slow- down, hindering it from analyzing large numbers of malware samples. In this paper, we attempt to miti- gate the high overhead of semantics-based binary diff- ing with application to malware lineage inference. We first study the key obstacles that contribute to the performance bottleneck. Then we propose normalized basic block memoization to speed up semantics-based binary diffing. We introduce an unionfind set structure that records semantically equivalent basic blocks. Managing the union-find structure during successive comparisons allows direct reuse of previously computed results. Moreover, we utilize a set of enhanced optimization methods to further cut down the invocation numbers of constraint solver. We have implemented our tech- nique, called MalwareHunt, on top of a trace-oriented binary diffing tool and evaluated it on 15 polymorphic and metamorphic malware families. We perform intra- family comparisons for the purpose of malware lineage inference. Our experimental results show that Malware- Huntcan accelerate symbolic execution from 2.8X to 5.3X (with an average 4.1X), and reduce constraint solver invocation by a factor of 3.0X to 6.0X (with an average 4.5X). 
    more » « less
  2. Android applications are usually obfuscated before release, making it difficult to analyze them for malware presence or intellectual property violations. Obfuscators might hide the true intent of code by renaming variables and/or modifying program structures. It is challenging to search for executables relevant to an obfuscated application for developers to analyze efficiently. Prior approaches toward obfuscation resilient search have relied on certain structural parts of apps remaining as landmarks, un-touched by obfuscation. For instance, some prior approaches have assumed that the structural relationships between identifiers are not broken by obfuscators; others have assumed that control flow graphs maintain their structures. Both approaches can be easily defeated by a motivated obfuscator. We present a new approach, MACNETO, to search for programs relevant to obfuscated executables leveraging deep learning and principal components on instructions. MACNETO makes few assumptions about the kinds of modifications that an obfuscator might perform. We show that it has high search precision for executables obfuscated by a state-of-the-art obfuscator that changes control flow. Further, we also demonstrate the potential of MACNETO to help developers understand executables, where MACNETO infers keywords (which are from relevant un-obfuscated programs) for obfuscated executables. 
    more » « less
  3. Malware written in dynamic languages such as PHP routinely employ anti-analysis techniques such as obfuscation schemes and evasive tricks to avoid detection. On top of that, attackers use automated malware creation tools to create numerous variants with little to no manual effort. This paper presents a system called Cubismo to solve this pressing problem. It processes potentially malicious files and decloaks their obfuscations, exposing the hidden malicious code into multiple files. The resulting files can be scanned by existing malware detection tools, leading to a much higher chance of detection. Cubismo achieves improved detection by exploring all executable statements of a suspect program counterfactually to see through complicated polymorphism, metamorphism and, obfuscation techniques and expose any malware. Our evaluation on a real-world data set collected from a commercial web hosting company shows that Cubismo is highly effective in dissecting sophisticated metamorphic malware with multiple layers of obfuscation. In particular, it enables VirusTotal to detect 53 out of 56 zero-day malware samples in the wild, which were previously undetectable. 
    more » « less
  4. We explore a new pathway to designing unclonable cryptographic primitives. We propose a new notion called unclonable puncturable obfuscation (UPO) and study its implications for unclonable cryptography. Using UPO, we present modular (and in some cases, arguably, simple) constructions of many primitives in unclonable cryptography, including, public-key quantum money, quantum copy-protection for many classes of functionalities, unclonable encryption, and single-decryption encryption. Notably, we obtain the following new results assuming the existence of UPO: We show that any cryptographic functionality can be copy-protected as long as it satisfies a notion of security, which we term puncturable security. Prior feasibility results focused on copy-protecting specific cryptographic functionalities. We show that copy-protection exists for any class of evasive functions as long as the associated distribution satisfies a preimage-sampleability condition. Prior works demonstrated copy-protection for point functions, which follows as a special case of our result. We put forward two constructions of UPO. The first construction satisfies two notions of security based on the existence of (post-quantum) sub-exponentially secure indistinguishability obfuscation, injective one-way functions, the quantum hardness of learning with errors, and the two versions of a new conjecture called the simultaneous inner product conjecture. The security of the second construction is based on the existence of unclonable-indistinguishable bit encryption, injective one-way functions, and quantum-state indistinguishability obfuscation. 
    more » « less
  5. Design-for-test/debug (DfT/D) introduces scan chain testing to increase testability and fault coverage by inserting scan flip-flops. However, these scan chains are also known to be a liability for security primitives. In previous research, the dynamically obfuscated scan chain (DOSC) was introduced to protect logic-locking keys from scan-based attacks by obscuring test patterns and responses. In this paper, we present DOSCrack, an oracle-guided attack to de-obfuscate DOSC using symbolic execution and binary clustering, which significantly reduces the candidate seed space to a manageable quantity. Our symbolic execution engine employs scan mode simulation and satisfiability modulo theories (SMT) solvers to reduce the possible seed space, while obfuscation key clustering allows us to effectively rule out a group of seeds that share similarities. An integral component of our approach is the use of sequential equivalence checking (SEC), which aids in identifying distinct simulation patterns to differentiate between potential obfuscation keys. We experimentally applied our DOSCrack framework on four different sizes of DOSC benchmarks and compared their runtime and complexity. Finally, we propose a low-cost countermeasure to DOSCrack which incorporates a nonlinear feedback shift register (NLFSR) to increase the effort of symbolic execution modeling and serves as an effective defense against our DOSCrack framework. Our research effectively addresses a critical vulnerability in scan-chain obfuscation methodologies, offering insights into DfT/D and logic locking for both academic research and industrial applications. Our framework highlights the need to craft robust and adaptable defense mechanisms to counter evolving scan-based attacks. 
    more » « less