skip to main content


Title: Eliminating Micro-Architectural Side-Channel Attacks using Near Memory Processing
Secure architectures are becoming an increasingly common demand. This is due in large part to the rise of cloud computing, as users are trusting their data with hardware that they do not own. Unfortunately, many secure computation and isolation techniques are still susceptible to side-channel attacks. While various defenses to side-channel attacks exist, each tends to be targeted to a specific vulnerability and comes with a high runtime overhead, making it difficult to combine these defenses together in a performant manner.This work proposes an efficient design for preventing a large range of cache side-channel attacks by leveraging a near-memory processing (NMP) architecture. Specifically, the proposed design stores all sensitive data in isolated NMP vaults and performs all computation involving that sensitive data on NMP cores. Our approach eliminates possible cache side-channels while also minimizing runtime overhead when the parallelizability of NMP architecture is leveraged. Simulation results from a cycle-accurate architecture model shows that offloading secure computation to NMP cores can have as little as 0.26% overhead.  more » « less
Award ID(s):
1908806
NSF-PAR ID:
10472545
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
2022 IEEE International Symposium on Secure and Private Execution Environment Design (SEED)
Page Range / eLocation ID:
179 to 189
Subject(s) / Keyword(s):
["near memory processing, secure architecture, cache side-channel defense, hardware security"]
Format(s):
Medium: X
Location:
Storrs, CT, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Commodity operating system (OS) kernels, such as Windows, Mac OS X, Linux, and FreeBSD, are susceptible to numerous security vulnerabilities. Their monolithic design gives successful attackers complete access to all application data and system resources. Shielding systems such as InkTag, Haven, and Virtual Ghost protect sensitive application data from compromised OS kernels. However, such systems are still vulnerable to side-channel attacks. Worse yet, compromised OS kernels can leverage their control over privileged hardware state to exacerbate existing side channels; recent work has shown that a compromised OS kernel can steal entire documents via side channels. This paper presents defenses against page table and last-level cache (LLC) side-channel attacks launched by a compromised OS kernel. Our page table defenses restrict the OS kernel’s ability to read and write page table pages and defend against page allocation attacks, and our LLC defenses utilize the Intel Cache Allocation Technology along with memory isolation primitives. We proto- type our solution in a system we call Apparition, building on an optimized version of Virtual Ghost. Our evaluation shows that our side-channel defenses add 1% to 18% (with up to 86% for one application) overhead to the optimized Virtual Ghost (relative to the native kernel) on real-world applications. 
    more » « less
  2. Typical cybersecurity solutions emphasize on achieving defense functionalities. However, execution efficiency and scalability are equally important, especially for real-world deployment. Straightforward mappings of cybersecurity applications onto HPC platforms may significantly underutilize the HPC devices’ capacities. On the other hand, the sophisticated implementations are quite difficult: they require both in-depth understandings of cybersecurity domain-specific characteristics and HPC architecture and system model. In our work, we investigate three sub-areas in cybersecurity, including mobile software security, network security, and system security. They have the following performance issues, respectively: 1) The flow- and context-sensitive static analysis for the large and complex Android APKs are incredibly time-consuming. Existing CPU-only frameworks/tools have to set a timeout threshold to cease the program analysis to trade the precision for performance. 2) Network intrusion detection systems (NIDS) use automata processing as its searching core and requires line-speed processing. However, achieving high-speed automata processing is exceptionally difficult in both algorithm and implementation aspects. 3) It is unclear how the cache configurations impact time-driven cache side-channel attacks’ performance. This question remains open because it is difficult to conduct comparative measurement to study the impacts. In this dissertation, we demonstrate how application-specific characteristics can be leveraged to optimize implementations on various types of HPC for faster and more scalable cybersecurity executions. For example, we present a new GPU-assisted framework and a collection of optimization strategies for fast Android static data-flow analysis that achieve up to 128X speedups against the plain GPU implementation. For network intrusion detection systems (IDS), we design and implement an algorithm capable of eliminating the state explosion in out-of-order packet situations, which reduces up to 400X of the memory overhead. We also present tools for improving the usability of Micron’s Automata Processor. To study the cache configurations’ impact on time-driven cache side-channel attacks’ performance, we design an approach to conducting comparative measurement. We propose a quantifiable success rate metric to measure the performance of time-driven cache attacks and utilize the GEM5 platform to emulate the configurable cache. 
    more » « less
  3. Website fingerprinting attacks, which use statistical analysis on network traffic to compromise user privacy, have been shown to be effective even if the traffic is sent over anonymity-preserving networks such as Tor. The classical attack model used to evaluate website fingerprinting attacks assumes an on-path adversary, who can observe all traffic traveling between the user’s computer and the secure network. In this work we investigate these attacks under a different attack model, in which the adversary is capable of sending a small amount of malicious JavaScript code to the target user’s computer. The malicious code mounts a cache side-channel attack, which exploits the effects of contention on the CPU’s cache, to identify other websites being browsed. The effectiveness of this attack scenario has never been systematically analyzed, especially in the open-world model which assumes that the user is visiting a mix of both sensitive and non-sensitive sites. We show that cache website fingerprinting attacks in JavaScript are highly feasible. Specifically, we use machine learning techniques to classify traces of cache activity. Unlike prior works, which try to identify cache conflicts, our work measures the overall occupancy of the last-level cache. We show that our approach achieves high classification accuracy in both the open-world and the closed-world models. We further show that our attack is more resistant than network-based fingerprinting to the effects of response caching, and that our techniques are resilient both to network-based defenses and to side-channel countermeasures introduced to modern browsers as a response to the Spectre attack. To protect against cache-based website fingerprinting, new defense mechanisms must be introduced to privacy-sensitive browsers and websites. We investigate one such mechanism, and show that generating artificial cache activity reduces the effectiveness of the attack and completely eliminates it when used in the Tor Browser 
    more » « less
  4. With the ever-increasing virtualization of software and hardware, the privacy of user-sensitive data is a fundamental concern in computation outsourcing. Secure processors enable a trusted execution environment to guarantee security properties based on the principles of isolation, sealing, and integrity. However, the shared hardware resources within the microarchitecture are increasingly being used by co-located adversarial software to create timing-based side-channel attacks. State-of-the-art secure processors implement the strong isolation primitive to enable non-interference for shared hardware, but suffer from frequent state purging and resource utilization overheads, leading to degraded performance. This paper proposes ASM , an adaptive secure multicore architecture that enables a reconfigurable, yet strongly isolated execution environment. For outsourced security-critical processes, the proposed security kernel and hardware extensions allow either a given process to execute using all available cores, or co-execute multiple processes on strongly isolated clusters of cores. This spatio-temporal execution environment is configured based on resource demands of processes, such that the secure processor mitigates state purging overheads and maximizes hardware resource utilization. 
    more » « less
  5. null (Ed.)
    Memory hard functions (MHFs) are an important cryptographic primitive that are used to design egalitarian proofs of work and in the construction of moderately expensive key-derivation functions resistant to brute-force attacks. Broadly speaking, MHFs can be divided into two categories: data-dependent memory hard functions (dMHFs) and data-independent memory hard functions (iMHFs). iMHFs are resistant to certain side-channel attacks as the memory access pattern induced by the honest evaluation algorithm is independent of the potentially sensitive input e.g., password. While dMHFs are potentially vulnerable to side-channel attacks (the induced memory access pattern might leak useful information to a brute-force attacker), they can achieve higher cumulative memory complexity (CMC) in comparison than an iMHF. In particular, any iMHF that can be evaluated in N steps on a sequential machine has CMC at most 𝒪((N^2 log log N)/log N). By contrast, the dMHF scrypt achieves maximal CMC Ω(N^2) - though the CMC of scrypt would be reduced to just 𝒪(N) after a side-channel attack. In this paper, we introduce the notion of computationally data-independent memory hard functions (ciMHFs). Intuitively, we require that memory access pattern induced by the (randomized) ciMHF evaluation algorithm appears to be independent from the standpoint of a computationally bounded eavesdropping attacker - even if the attacker selects the initial input. We then ask whether it is possible to circumvent known upper bound for iMHFs and build a ciMHF with CMC Ω(N^2). Surprisingly, we answer the question in the affirmative when the ciMHF evaluation algorithm is executed on a two-tiered memory architecture (RAM/Cache). We introduce the notion of a k-restricted dynamic graph to quantify the continuum between unrestricted dMHFs (k=n) and iMHFs (k=1). For any ε > 0 we show how to construct a k-restricted dynamic graph with k=Ω(N^(1-ε)) that provably achieves maximum cumulative pebbling cost Ω(N^2). We can use k-restricted dynamic graphs to build a ciMHF provided that cache is large enough to hold k hash outputs and the dynamic graph satisfies a certain property that we call "amenable to shuffling". In particular, we prove that the induced memory access pattern is indistinguishable to a polynomial time attacker who can monitor the locations of read/write requests to RAM, but not cache. We also show that when k=o(N^(1/log log N)) , then any k-restricted graph with constant indegree has cumulative pebbling cost o(N^2). Our results almost completely characterize the spectrum of k-restricted dynamic graphs. 
    more » « less