skip to main content

Title: HALucinator: Firmware Re-hosting Through Abstraction Layer Emulation
Given the increasing ubiquity of online embedded devices, analyzing their firmware is important to security, privacy, and safety. The tight coupling between hardware and firmware and the diversity found in embedded systems makes it hard to perform dynamic analysis on firmware. However, firmware developers regularly develop code using abstractions, such as Hardware Abstraction Layers (HALs), to simplify their job. We leverage such abstractions as the basis for the re-hosting and analysis of firmware. By providing high-level replacements for HAL functions (a process termed High-Level Emulation – HLE), we decouple the hardware from the firmware. This approach works by first locating the library functions in a firmware sample, through binary analysis, and then providing generic implementations of these functions in a full-system emulator. We present these ideas in a prototype system, HALucinator, able to re-host firmware, and allow the virtual device to be used normally. First, we introduce extensions to existing library matching techniques that are needed to identify library functions in binary firmware, to reduce collisions, and for inferring additional function names. Next, we demonstrate the re-hosting process, through the use of simplified handlers and peripheral models, which make the process fast, flexible, and portable between firmware samples and chip vendors. Finally, we demonstrate the practicality of HLE for security analysis, by supplementing HALucinator with the American Fuzzy Lop fuzzer, to locate multiple previously-unknown vulnerabilities in firmware middleware libraries.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the 29th USENIX Security Symposium
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The recent paradigm shift introduced by the Internet of Things (IoT) has brought embedded systems into focus as a target for both security analysts and malicious adversaries. Typified by their lack of standardized hardware, diverse software, and opaque functionality, IoT devices present unique challenges to security analysts due to the tight coupling between their firmware and the hardware for which it was designed. In order to take advantage of modern program analysis techniques, such as fuzzing or symbolic execution, with any kind of scale or depth, analysts must have the ability to execute firmware code in emulated (or virtualized) environments. However, these emulation environments are rarely available and cumbersome to create through manual reverse engineering, greatly limiting the analysis of binary firmware. In this work, we explore the problem of firmware re-hosting, the process by which firmware is migrated from its original hardware environment into a virtualized one. We show that an approach capable of creating virtual, interactive environments in an automated manner is a necessity to enable firmware analysis at scale. We present the first proof-of-concept system aiming to achieve this goal, called PRETENDER, which uses observations of the interactions between the original hardware and the firmware to automatically create models of peripherals, and allows for the execution of the firmware in a fully-emulated environment. Unlike previous approaches, these models are interactive, stateful, and transferable, meaning they are designed to allow the program to receive and process new input, a requirement of many analyses. We demonstrate our approach on multiple hardware platforms and firmware samples, and show that the models are flexible enough to allow for virtualized code execution, the exploration of new code paths, and the identification of security vulnerabilities. 
    more » « less
  2. Human analysts must reverse engineer binary programs as a prerequisite for a number of security tasks, such as vulnerability analysis, malware detection, and firmware re-hosting. Existing studies of human reversers and the processes they follow are limited in size and often use qualitative metrics that require subjective evaluation. In this paper, we reframe the problem of reverse engineering binaries as the problem of perfect decompilation, which is the process of recovering, from a binary program, source code that, when compiled, produces binary code that is identical to the original binary. This gives us a quantitative measure of understanding, and lets us examine the reversing process programmatically. We developed a tool, called Decomperson, that supported a group of reverse engineers during a large-scale security competition designed to collect information about the participants' reverse engineering process, with the well-defined goal of achieving perfect decompilation. Over 150 people participated, and we collected more than 35,000 code submissions, the largest manual reverse engineering dataset to date. This includes snapshots of over 300 successful perfect decompilation attempts. In this paper, we show how perfect decompilation allows programmatic analysis of such large datasets, providing new insights into the reverse engineering process. 
    more » « less
  3. The massive trend toward embedded systems introduces new security threats to prevent. Malicious firmware makes it easier to launch cyberattacks against embedded systems. Systems infected with malicious firmware maintain the appearance of normal firmware operations but execute undesirable activities, which is usually a security risk. Traditionally, cybercriminals use malicious firmware to develop possible back-doors for future attacks. Due to the restricted resources of embedded systems, it is difficult to thwart these attacks using the majority of contemporary standard security protocols. In addition, monitoring the firmware operations using existing side channels from outside the processing unit, such as electromagnetic radiation, necessitates a complicated hardware configuration and in-depth technical understanding. In this paper, we propose a physical side channel that is formed by detecting the overall impedance changes induced by the firmware actions of a central processing unit. To demonstrate how this side channel can be exploited for detecting firmware activities, we experimentally validate it using impedance measurements to distinguish between distinct firmware operations with an accuracy of greater than 90%. These findings are the product of classifiers that are trained via machine learning. The implementation of our proposed methodology also leaves room for the use of hardware authentication. 
    more » « less
  4. Functional hardware description languages (FHDL) provide powerful tools for building new abstractions that enable sophisticated hardware system to be constructed by composing small reusable parts. Raising the level of abstractions in hardware designs means the programmer can focus on high-level circuit structure rather than mundane low-level details. The language features that facilitate this include high-order functions, rich static type system with type inference, and parametric polymorphism. We use hand-written structural and behavioral VHDL, Simulink, and the Kansas Lava FHDL to re-implement several components taken from a Simulink model of an orthogonal frequency-division multiplexing (OFDM) physical layer (PHY). Our development demonstrates that an FHDL can require fewer lines of code than traditional design languages without sacrificing performance. 
    more » « less
  5. null (Ed.)
    Closely monitoring the behavior of a software system during its execution enables developers and analysts to observe, and ultimately understand, how it works. This kind of dynamic analysis can be instrumental to reverse engineering, vulnerability discovery, exploit development, and debugging. While these analyses are typically well-supported for homogeneous desktop platforms (e.g., x86 desktop PCs), they can rarely be applied in the heterogeneous world of embedded systems. One approach to enable dynamic analyses of embedded systems is to move software stacks from physical systems into virtual environments that sufficiently model hardware behavior. This process which we call “rehosting” poses a significant research challenge with major implications for security analyses. Although rehosting has traditionally been an unscientific and ad-hoc endeavor undertaken by domain experts with varying time and resources at their disposal, researchers are beginning to address rehosting challenges systematically and in earnest. In this paper, we establish that emulation is insufficient to conduct large-scale dynamic analysis of real-world hardware systems and present rehosting as a firmware-centric alternative. Furthermore, we taxonomize preliminary rehost- ing efforts, identify the fundamental components of the rehosting process, and propose directions for future research. 
    more » « less