Man-at-the-end (MATE) attacks against software programs are difficult to protect. Adversaries have complete access to the binary program and can run it under both static and dynamic analysis to find and break any software protection mechanisms put in place. Even though full-proof protection is not possible practically or theoretically, the goal of software protection should be to make it more difficult for an adversary to find program secrets by increasing either their monetary cost or time. Protection mechanisms must be easy to integrate into the software development lifecycle, or else they are of little to no use. In this paper, we evaluate the practical security of a watermarking technique known as Weaver, which is intended to support software watermarking based on a new transformation technique called executable steganography. Weaver allows hiding of identification marks directly into a program binary in a way that makes it difficult for an adversary to find and remove. We performed instruction frequency analysis on 106 programs from the GNU coreutils package to understand and define Weaver’s limitations and strengths as a watermarking technique. Our evaluation revealed that the initial prototype version of Weaver suffers from limitations in terms of standard benchmarks for steganography evaluation, such as its stealth. We found that this initial prototype of Weaver relied heavily on one type of instruction that does not frequently occur in standard programs, namely the mov instruction with an 8-byte immediate operand. Our instruction frequency analysis revealed a negative impact due to Weaver’s over-reliance on this mov instruction.
more »
« less
Evaluating Security of Executable Steganography for Digital Software Watermarking
Man-at-the-end (MATE) attacks against software programs are difficult to protect. Adversaries have complete access to the binary program and can run it under both static and dynamic analysis to find and break any software protection mechanisms put in place. Even though full-proof protection is not possible practically or theoretically, the goal of software protection should be to make it more difficult for an adversary to find program secrets by increasing either their monetary cost or time. Protection mechanisms must be easy to integrate into the software development lifecycle, or else they are of little to no use. In this paper, we evaluate the practical security of a watermarking technique known as Weaver, which is intended to support software watermarking based on a new transformation technique called executable steganography. Weaver allows hiding of identification marks directly into a program binary in a way that makes it difficult for an adversary to find and remove. We performed instruction frequency analysis on 106 programs from the GNU coreutils package to understand and define Weaver’s limitations and strengths as a watermarking technique. Our evaluation revealed that the initial prototype version of Weaver suffers from limitations in terms of standard benchmarks for steganography evaluation, such as its stealth. We found that this initial prototype of Weaver relied heavily on one type of instruction that does not frequently occur in standard programs, namely the mov instruction with an 8-byte immediate operand. Our instruction frequency analysis revealed a negative impact due to Weaver’s over-reliance on this mov instruction.
more »
« less
- Award ID(s):
- 1811560
- PAR ID:
- 10176504
- Date Published:
- Journal Name:
- 7th Annual Cybersecurity Symposium
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Software watermarking schemes allow a user to embed an identifier into a piece of code such that the resulting program is nearly functionally-equivalent to the original program, and yet, it is difficult to remove the identifier without destroying the functionality of the program. Such schemes are often considered for proving software ownership or for digital rights management. Existing constructions of watermarking have focused primarily on watermarking pseudorandom functions (PRFs). In this work, we revisit the definitional foundations of watermarking, and begin by highlighting a major flaw in existing security notions. Existing security notions for watermarking only require that the identifier be successfully extracted from programs that preserve the exact input/output behavior of the original program. In the context of PRFs, this means that an adversary that constructs a program which computes a quarter of the output bits of the PRF or that is able to distinguish the outputs of the PRF from random are considered to be outside the threat model. However, in any application (e.g., watermarking a decryption device or an authentication token) that relies on PRF security, an adversary that manages to predict a quarter of the bits or distinguishes the PRF outputs from random would be considered to have defeated the scheme. Thus, existing watermarking schemes provide very little security guarantee against realistic adversaries. None of the existing constructions of watermarkable PRFs would be able to extract the identifier from a program that only outputs a quarter of the bits of the PRF or one that perfectly distinguishes. To address the shortcomings in existing watermarkable PRF definitions, we introduce a new primitive called a traceable PRF. Our definitions are inspired by similar definitions from public-key traitor tracing, and aim to capture a very robust set of adversaries: namely, any adversary that produces a useful distinguisher (i.e., a program that can break PRF security), can be traced to a specific identifier. We provide a general framework for constructing traceable PRFs via an intermediate primitive called private linear constrained PRFs. Finally, we show how to construct traceable PRFs from a similar set of assumptions previously used to realize software watermarking. Namely, we obtain a single-key traceable PRF from standard lattice assumptions and a fully collusion-resistant traceable PRF from indistinguishability obfuscation (together with injective one-way functions).more » « less
-
Perumalla, Kalyan; Lopez Jr., Juan; Siraj, Ambareen (Ed.)Executable steganography, the hiding of software machine code inside of a larger program, is a potential approach to introduce new software protection constructs such as watermarks or fingerprints. Software fingerprinting is, therefore, a process similar to steganography, hiding data within other data. The goal of fingerprinting is to hide a unique secret message, such as a serial number, into copies of an executable program in order to provide proof of ownership of that program. Fingerprints are a special case of watermarks, with the difference being that each fingerprint is unique to each copy of a program. Traditionally, researchers describe four aims that a software fingerprint should achieve. These include the fingerprint should be difficult to remove, it should not be obvious, it should have a low false positive rate, and it should have negligible impact on performance. In this research, we propose to extend these objectives and introduce a fifth aim: that software fingerprints should be machine independent. As a result, the same fingerprinting method can be used regardless of the architecture used to execute the program. Hence, this paper presents an approach towards the realization of machine-independent fingerprinting of executable programs. We make use of Low-Level Virtual Machine (LLVM) intermediate representation during the software compilation process to demonstrate both a simple static fingerprinting method as well as a dynamic method, which displays our aim of hardware independent fingerprinting. The research contribution includes a realization of the approach using the LLVM infrastructure and provides a proof of concept for both simple static and dynamic watermarks that are architecture neutral.more » « less
-
Malware authors make use of several techniques to obfuscate code from reverse engineering tools such as IdaPro. Typically, these techniques tend to be effective for about three to six instructions, but eventually the tools can properly disassemble the remaining code once the tool is again synchronized with the operation codes. But this loss of synchronization can be used to hide information within the instructions – steganography. Our research explores an approach to this by presenting “Weaver”, a framework for executable steganography. “Weaver” differs from other techniques in how it hides malicious instructions: the hiding instructions are prepared by generating an assembly listing of the program and finding candidate hiding locations, the steganography instructions are prepared by creating an assembly listing of the program to obtain the operation codes to be hidden, and the “weaving” process merges the two. This “weaving” attempts to place all the steganography instructions into candidate locations found in the hiding instructions.more » « less
-
Obfuscation is an important technique to protect software from adversary analysis. Control flow obfuscation effectively prevents attackers from understanding the program structure, hence impeding a broad set of reverse engineering efforts. In this paper, we propose a novel control flow obfuscation method which employs Turing machines to simulate the computation of branch conditions. By weaving the orig- inal program with Turing machine components, program control flow graph and call graph can become much more complicated. In addition, due to the runtime computation complexity of a Turing machine, pro- gram execution flow would be highly obfuscated and become resilient to advanced reverse engineering approaches via symbolic execution and concolic testing. We have implemented a prototype tool for Turing obfuscation. Compar- ing with previous work, our control flow obfuscation technique delivers three distinct advantages. 1). Complexity: the complicated structure of a Turing machine makes it difficult for attackers to understand the program control flow. 2). Universality: Turing machines can encode any compu- tation and hence applicable to obfuscate any program component. 3). Resiliency: Turing machine brings in complex execution model, which is shown to withstand automated reverse engineering efforts. Our evalua- tion obfuscates control flow predicates of two widely-used applications, and the experimental results show that the proposed technique can ob- fuscate programs in stealth with good performance and robustness.more » « less
An official website of the United States government

