

Title: Racing in Hyperspace: Closing Hyper-Threading Side Channels on SGX with Contrived Data Races
In this paper, we present HYPERRACE, an LLVM-based tool for instrumenting SGX enclave programs to eradicate all side-channel threats due to Hyper-Threading. HYPERRACE creates a shadow thread for each enclave thread and asks the underlying untrusted operating system to schedule both threads on the same physical core whenever enclave code is invoked, so that Hyper-Threading side channels are closed completely. Without placing additional trust in the operating system's CPU scheduler, HYPERRACE conducts a physical-core co-location test: it first constructs a communication channel between the threads using a shared variable inside the enclave and then measures the communication speed to verify that the communication indeed takes place in the shared L1 data cache, a strong indicator of physical-core co-location. The key novelty of the work is the measurement of communication speed without a trustworthy clock; instead, relative time measurements are taken via contrived data races on the shared variable. It is worth noting that HYPERRACE's defense emphasizes Hyper-Threading side channels because they remain an open research problem; HYPERRACE also detects the occurrence of exception- and interrupt-based side channels, solutions to which have been studied in several prior works.
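The data-race clock at the heart of the co-location test can be pictured with a small user-space experiment. The sketch below (ordinary pthreads rather than enclave code; the variable names, round counts, and spin bound are illustrative and not taken from the paper) has two threads race on a single shared variable, each counting how many reads it takes before it observes the other thread's write: when the two threads share a physical core, the peer's store becomes visible through the common L1 data cache after only a few reads, whereas across cores the cache-coherence round trip makes the spin counts noticeably larger.

```c
/*
 * Minimal, non-SGX illustration of the contrived-data-race clock behind
 * the co-location test.  Each thread stores its own ID into a shared
 * variable and then spin-reads it, counting reads until the peer's ID
 * shows up (bounded by MAX_SPIN).  Small average spin counts indicate
 * L1-level communication, i.e. likely physical-core co-location.
 * All names and bounds are illustrative, not taken from the paper.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define ROUNDS   100000
#define MAX_SPIN 1000

static atomic_int shared_var = 0;          /* the contrived data race */
static unsigned long total_spins[2];       /* per-thread relative "clock" */

static void *race(void *arg)
{
    int my_id = (int)(long)arg;            /* 0 or 1 */
    int peer  = 1 - my_id;

    for (int r = 0; r < ROUNDS; r++) {
        atomic_store_explicit(&shared_var, my_id, memory_order_relaxed);

        /* spin until the peer's store is observed (or give up) */
        int spins = 0;
        while (spins < MAX_SPIN &&
               atomic_load_explicit(&shared_var,
                                    memory_order_relaxed) != peer)
            spins++;
        total_spins[my_id] += (unsigned long)spins;
    }
    return NULL;
}

int main(void)
{
    pthread_t t0, t1;
    pthread_create(&t0, NULL, race, (void *)0L);
    pthread_create(&t1, NULL, race, (void *)1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);

    /* Small averages suggest the two threads shared a physical core. */
    printf("avg spins: thread 0 = %.1f, thread 1 = %.1f\n",
           (double)total_spins[0] / ROUNDS,
           (double)total_spins[1] / ROUNDS);
    return 0;
}
```

No trustworthy clock is involved; the spin count, i.e. the number of instructions executed before the peer's write is observed, serves as the relative time measurement.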
Award ID(s):
1750809 1718084
NSF-PAR ID:
10063362
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings - IEEE Symposium on Security and Privacy
ISSN:
1081-6011
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Speculative-execution attacks, such as SgxSpectre, Foreshadow, and MDS attacks, leverage recently disclosed CPU hardware vulnerabilities and micro-architectural side channels to breach the confidentiality and integrity of Intel Software Guard eXtensions (SGX). Unlike traditional micro-architectural side-channel attacks, speculative-execution attacks can extract any data in the enclave memory, which makes them very challenging to defeat purely in software. However, to date, Intel has not completely mitigated the threats of speculative-execution attacks in hardware; hence, future attack variants may emerge. This paper proposes a software-based solution to speculative-execution attacks, even under the strong assumption that the confidentiality of enclave memory is compromised. Our solution extends an existing work called HyperRace, a compiler-assisted tool for detecting Hyper-Threading-based side-channel attacks against SGX enclaves, to thwart speculative-execution attacks from within SGX enclaves. It requires support from the untrusted operating system, e.g., for temporarily disabling interrupts, but verifies the OS's behavior. Additional microcode upgrades are required from Intel to secure the attestation flow.
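One way an enclave can verify, rather than trust, the OS's claim to have kept interrupts disabled is the State-Save-Area marker check also used for detecting exception- and interrupt-based side channels: plant a sentinel in the thread's SSA, run the critical code, and test whether an Asynchronous Enclave Exit overwrote it. The following sketch is only an illustration of that idea; ssa_marker_addr() is a stand-in, since locating the SSA frame is SDK- and ABI-specific.

```c
/*
 * Sketch of the SSA-marker trick.  On every Asynchronous Enclave Exit
 * (AEX) the CPU dumps the interrupted register state into the thread's
 * State Save Area (SSA) inside the enclave, clobbering any marker
 * previously written there; checking the marker afterwards reveals
 * whether an interrupt or exception fired.
 */
#include <stdint.h>

#define SSA_MARKER 0x4D41524B4D41524BULL   /* arbitrary sentinel */

/* Stand-in so the sketch compiles; a real enclave would return a pointer
 * into the current thread's SSA frame instead of this static word. */
static volatile uint64_t fake_ssa_word;
static volatile uint64_t *ssa_marker_addr(void) { return &fake_ssa_word; }

static void aex_monitor_begin(void)
{
    /* plant the marker before entering the security-critical region */
    *ssa_marker_addr() = SSA_MARKER;
}

static int aex_occurred(void)
{
    /* an AEX would have overwritten the marker with saved register state */
    return *ssa_marker_addr() != SSA_MARKER;
}

void critical_region(void (*body)(void))
{
    do {
        aex_monitor_begin();
        body();                 /* code that must run without interruption */
    } while (aex_occurred());   /* if the OS broke its promise, redo the
                                   check before proceeding */
}
```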
  2. Recent work in networking, storage and multi-threading has demonstrated improved performance and scalability by replacing kernel-mode interrupts with high-rate user-space polling. Typically, such polling is performed by a dedicated core. Compiler Interrupts (CIs) instead enable efficient, automatic high-rate polling on a shared thread, which performs other work between polls. CIs are instrumentation-based and light-weight, allowing frequent interrupts with little performance impact. For example, when targeting a 5,000 cycle interval, the median overhead of our fastest CI design is 4% vs. 800% for hardware interrupts, across programs in the SPLASH-2, Phoenix and Parsec benchmark suites running with 32 threads. We evaluate CIs on three systems-level applications: (a) kernel bypass networking with mTCP, (b) joint kernel bypass networking and CPU scheduling with Shenango, and (c) delegation, a message-passing alternative to locking, with FFWD. For each application, we find that CIs offer compelling qualitative and quantitative improvements over the current state of the art. For example, CI-based mTCP achieves ≈2× stock mTCP throughput on a sample HTTP application.
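Conceptually, a Compiler Interrupt lowers to a cheap probe that the compiler inserts at loop back-edges and call sites; only when the target interval has elapsed does control divert to a registered handler. The hand-written sketch below mimics that shape with an rdtsc-based probe (x86 only); the 5,000-cycle interval mirrors the one quoted in the abstract, but the actual CI designs use cheaper, approximate cycle or instruction accounting, and every name here is illustrative.

```c
/*
 * Hand-written illustration of what a Compiler Interrupt lowers to: a
 * cheap "has the interval elapsed?" check sprinkled into the code, which
 * calls a user-registered handler only when the deadline has passed.
 */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define CI_INTERVAL_CYCLES 5000     /* target interval from the abstract */

typedef void (*ci_handler_t)(void);

static ci_handler_t ci_handler;
static uint64_t     ci_next_deadline;

static void ci_register(ci_handler_t h)
{
    ci_handler = h;
    ci_next_deadline = __rdtsc() + CI_INTERVAL_CYCLES;
}

/* the probe the compiler would insert at loop back-edges / call sites */
static inline void ci_probe(void)
{
    if (__rdtsc() >= ci_next_deadline) {
        ci_next_deadline = __rdtsc() + CI_INTERVAL_CYCLES;
        ci_handler();               /* e.g. poll a NIC queue, run scheduler */
    }
}

static void poll_network(void) { /* placeholder for e.g. mTCP polling */ }

int main(void)
{
    ci_register(poll_network);
    long sum = 0;
    for (long i = 0; i < 100000000L; i++) {
        sum += i;                   /* "real" work on the shared thread */
        ci_probe();                 /* compiler-inserted in the real system */
    }
    printf("%ld\n", sum);
    return 0;
}
```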
  3. Current trends in desktop processor design have been toward many-core solutions with increased parallelism. As the number of supported threads grows in these processors, it may prove difficult to exploit them on the commodity desktop. This paper presents a study that explores offloading dynamic memory management activities to a separately executing thread that runs concurrently with the main program thread. Our approach works without requiring modifications to the original source program: we redefine the dynamic link path to capture malloc and free calls in a threading dynamic memory management library. The routines of this library are set up so that the initial call to malloc triggers the creation of a thread for dynamic memory management; successive calls to malloc and free then coordinate with this thread for dynamic memory management activities. Our preliminary studies show that we can transparently redefine the dynamic memory management activities, and we have successfully done so for numerous test programs, including most of the SPEC CPU2006 benchmarks, Firefox, and other Unix utilities. The results of our experiments show that it is possible to achieve 2-3% performance gains in the three most memory-intensive SPEC CPU2006 benchmarks without requiring recompilation of the benchmark source code. We were also able to achieve a 3-4% speedup when using our library with the gcc and llvm compilers.
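The "redefining the dynamic link path" mechanism amounts to LD_PRELOAD interposition. Below is a minimal skeleton of such a library in which the first malloc spawns the management thread; the worker's duties, the coordination with it, and the file and library names are placeholders, and the re-entrancy subtleties a production interposer must handle are glossed over.

```c
/*
 * Minimal sketch of malloc/free interposition: build this file into a
 * shared library and point LD_PRELOAD at it so the program's malloc and
 * free calls land here instead of in libc.  For example (names made up):
 *   gcc -shared -fPIC -o libthreadalloc.so threadalloc.c -ldl -lpthread
 *   LD_PRELOAD=./libthreadalloc.so ./some_program
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static void *(*real_malloc)(size_t) = NULL;
static void  (*real_free)(void *)   = NULL;
static atomic_int worker_started    = 0;

/* the separately executing memory-management thread */
static void *mm_worker(void *arg)
{
    (void)arg;
    /* placeholder: drain a queue of deferred frees, refill free lists, ... */
    return NULL;
}

void *malloc(size_t size)
{
    if (!real_malloc)
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");

    /* the initial call triggers creation of the management thread; the
     * flag flips *before* pthread_create so the allocations pthread_create
     * makes internally do not re-enter this branch */
    if (!atomic_exchange(&worker_started, 1)) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, mm_worker, NULL) == 0)
            pthread_detach(tid);
    }
    return real_malloc(size);   /* coordination with the worker omitted */
}

void free(void *ptr)
{
    if (!real_free)
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
    real_free(ptr);
}
```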
  4. A protocol for two-party secure function evaluation (2P-SFE) aims to allow the parties to learn the output of a function f of their private inputs while leaking nothing more. In a sense, such a protocol realizes a trusted oracle that computes f and returns the result to both parties. There have been tremendous strides in efficiency over the past ten years, yet 2P-SFE protocols remain impractical for most real-time, online computations, particularly on modestly provisioned devices. Intel's Software Guard Extensions (SGX) provides hardware-protected execution environments, called enclaves, that may be viewed as trusted computation oracles. While SGX provides native CPU speed for secure computation, previous side-channel and micro-architectural attacks have demonstrated how the security guarantees of enclaves can be compromised. In this paper, we explore a balanced approach to 2P-SFE on SGX-enabled processors by constructing a protocol for evaluating f relative to a partitioning of f. This approach alleviates the burden of trust on the enclave by allowing the protocol designer to choose which components should be evaluated within the enclave and which via standard cryptographic techniques. We describe SGX-enabled SFE protocols (modeling the enclave as an oracle) and formalize the strongest-possible notion of 2P-SFE for our setting. We prove that our protocol meets this notion when properly realized. We implement the protocol and apply it to two practical problems: privacy-preserving queries to a database, and a version of Dijkstra's algorithm for privacy-preserving navigation. Our evaluation shows that our SGX-enabled SFE scheme enjoys a 38x increase in performance over garbled-circuit-based SFE. Finally, we justify modeling the enclave as an oracle by implementing protections against known side channels.
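The partitioning idea can be pictured as tagging each component of f with the backend that should evaluate it, with the enclave treated as an oracle. The toy sketch below does only that, with stub backends and invented names; it ignores input privacy entirely (which is the whole point of the real construction) and says nothing about the actual protocol or its security argument.

```c
/*
 * Toy sketch of partitioned evaluation: f is split into components, each
 * tagged to run either inside the SGX enclave (modeled as an oracle call)
 * or under a generic 2P-SFE backend such as garbled circuits.  Both
 * backends below are stubs that simply run the function locally.
 */
#include <stdio.h>

typedef int (*component_fn)(int x_a, int x_b);

typedef enum { RUN_IN_ENCLAVE, RUN_VIA_GC } backend_t;

typedef struct {
    component_fn fn;
    backend_t    where;
} component_t;

/* stub: in a real system this would be an ECALL into the enclave oracle */
static int eval_in_enclave(component_fn fn, int a, int b) { return fn(a, b); }

/* stub: in a real system this would run a garbled-circuit protocol */
static int eval_via_gc(component_fn fn, int a, int b) { return fn(a, b); }

/* example partition of f(x_A, x_B) = (x_A + x_B) * (x_A - x_B) */
static int comp_sum (int a, int b) { return a + b; }
static int comp_diff(int a, int b) { return a - b; }

int main(void)
{
    component_t parts[] = {
        { comp_sum,  RUN_IN_ENCLAVE },  /* designer trusts the enclave here */
        { comp_diff, RUN_VIA_GC     },  /* ...but not here */
    };
    int x_A = 7, x_B = 3, results[2];

    for (int i = 0; i < 2; i++)
        results[i] = (parts[i].where == RUN_IN_ENCLAVE)
                   ? eval_in_enclave(parts[i].fn, x_A, x_B)
                   : eval_via_gc(parts[i].fn, x_A, x_B);

    printf("f = %d\n", results[0] * results[1]);
    return 0;
}
```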
  5. The Emu Chick is a prototype system designed around the concept of migratory memory-side processing. Rather than transferring large amounts of data across power-hungry, high-latency interconnects, the Emu Chick moves lightweight thread contexts to near-memory cores before the beginning of each memory read. The current prototype hardware uses FPGAs to implement cache-less “Gossamer” cores for doing computational work and a stationary core to run basic operating system functions and migrate threads between nodes. In this initial characterization of the Emu Chick, we study the memory bandwidth characteristics of the system through benchmarks such as STREAM, pointer chasing, and sparse matrix-vector multiply. We compare the Emu Chick hardware to architectural simulation and to Intel Xeon-based platforms. While it is difficult to accurately compare prototype hardware with existing systems, our initial evaluation demonstrates that the Emu Chick uses available memory bandwidth more efficiently than a more traditional, cache-based architecture. Moreover, the Emu Chick provides stable, predictable performance, with 80% bandwidth utilization on a random-access pointer-chasing benchmark with weak locality.
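A random-access pointer-chasing kernel of the kind used in such characterizations is easy to sketch: store a full-length random cycle in an index array so that every load depends on the previous one and lands at an unpredictable address. The parameters and names below are illustrative and are not those of the Emu Chick study.

```c
/*
 * Simple random-access pointer-chasing kernel.  Sattolo's algorithm
 * builds a single full-length cycle, so the chase visits every element
 * exactly once in a random order; each load's address depends on the
 * previous load, defeating prefetchers and exposing random-access
 * latency and bandwidth.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1u << 24)              /* 16M entries, ~128 MB of indices */

static uint64_t rng_state = 88172645463325252ULL;
static uint64_t xorshift64(void)  /* small, fast PRNG for the shuffle */
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

int main(void)
{
    uint64_t *next = malloc((size_t)N * sizeof *next);
    if (!next) return 1;

    /* Sattolo's algorithm: a uniformly random single cycle */
    for (uint64_t i = 0; i < N; i++) next[i] = i;
    for (uint64_t i = N - 1; i > 0; i--) {
        uint64_t j = xorshift64() % i;        /* j in [0, i-1] */
        uint64_t t = next[i]; next[i] = next[j]; next[j] = t;
    }

    /* pointer chase: each iteration's address comes from the last load */
    uint64_t idx = 0, sum = 0;
    for (uint64_t hop = 0; hop < N; hop++) {
        idx = next[idx];
        sum += idx;
    }

    printf("checksum %llu\n", (unsigned long long)sum);
    free(next);
    return 0;
}
```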