skip to main content


Title: Exploring Branch Predictors for Constructing Transient Execution Trojans
Transient execution is one of the most critical features used in CPUs to achieve high performance. Recent Spectre attacks demonstrated how this feature can be manipulated to force applications to reveal sensitive data. The industry quickly responded with a series of software and hardware mitigations among which microcode patches are the most prevalent and trusted. In this paper, we argue that currently deployed protections still leave room for constructing attacks. We do so by presenting transient trojans, software modules that conceal their malicious activity within transient execution mode. They appear completely benign, pass static and dynamic analysis checks, but reveal sensitive data when triggered. To construct these trojans, we perform a detailed analysis of the attack surface currently present in today's systems with respect to the recommended mitigation techniques. We reverse engineer branch predictors in several recent x86_64 processors which allows us to uncover previously unknown exploitation techniques. Using these techniques, we construct three types of transient trojans and demonstrate their stealthiness and practicality.  more » « less
Award ID(s):
1850365
NSF-PAR ID:
10188671
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems
Page Range / eLocation ID:
667 to 682
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Transient execution attacks such as Spectre and Meltdown exploit speculative execution in modern microprocessors to leak information via cache side‐channels. Software solutions to defend against many transient execution attacks employ thelfenceserialising instruction, which does not allow instructions that come after thelfenceto execute out‐of‐order with respect to instructions that come before thelfence. However, errors and Trojans in the hardware implementation oflfencecan be exploited to compromise the software mitigations that uselfence. The aforementioned security gap has not been identified and addressed previously. The authors provide a formal method solution that addresses the verification oflfencehardware implementation. The authors also show how hardware Trojans can be designed to circumventlfenceand demonstrate that their verification approach will flag such Trojans as well. The authors have demonstrated the efficacy of our approach using RSD, which is an open source RISC‐V based superscalar out‐of‐order processor.

     
    more » « less
  2. null (Ed.)
    In early 2018, Meltdown first showed how to read arbitrary kernel memory from user space by exploiting side-effects from transient instructions. While this attack has been mitigated through stronger isolation boundaries between user and kernel space, Meltdown inspired an entirely new class of fault-driven transient-execution attacks. Particularly, over the past year, Meltdown-type attacks have been extended to not only leak data from the L1 cache but also from various other microarchitectural structures, including the FPU register file and store buffer. In this paper, we present the ZombieLoad attack which uncovers a novel Meltdown-type effect in the processor’s fill-buffer logic. Our analysis shows that faulting load instructions (i.e., loads that have to be re-issued) may transiently dereference unauthorized destinations previously brought into the fill buffer by the current or a sibling logical CPU. In contrast to concurrent attacks on the fill buffer, we are the first to report data leakage of recently loaded and stored stale values across logical cores even on Meltdown- and MDS-resistant processors. Hence, despite Intel’s claims [36], we show that the hardware fixes in new CPUs are not sufficient. We demonstrate ZombieLoad’s effectiveness in a multitude of practical attack scenarios across CPU privilege rings, OS processes, virtual machines, and SGX enclaves. We discuss both short and long-term mitigation approaches and arrive at the conclusion that disabling hyperthreading is the only possible workaround to prevent at least the most-powerful cross-hyperthread attack scenarios on current processors, as Intel’s software fixes are incomplete. 
    more » « less
  3. As control-flow protection techniques are widely deployed, it is difficult for attackers to modify control data, like function pointers, to hijack program control flow. Instead, data-only attacks corrupt security-critical non-control data (critical data), and can bypass all control-flow protections to revive severe attacks. Previous works have explored various methods to help construct or prevent data-only attacks. However, no solution can automatically detect program-specific critical data. In this paper, we identify an important category of critical data, syscall-guard variables, and propose a set of solutions to automatically detect such variables in a scalable manner. Syscall-guard variables determine to invoke security-related system calls (syscalls), and altering them will allow attackers to request extra privileges from the operating system. We propose branch force, which intentionally flips every conditional branch during the execution and checks whether new security-related syscalls are invoked. If so, we conduct data-flow analysis to estimate the feasibility to flip such branches through common memory errors. We build a tool, VIPER, to implement our ideas. VIPER successfully detects 34 previously unknown syscall-guard variables from 13 programs. We build four new data-only attacks on sqlite and v8, which execute arbitrary command or delete arbitrary file. VIPER completes its analysis within five minutes for most programs, showing its practicality for spotting syscall-guard variables. 
    more » « less
  4. Modern integrated circuits (ICs) possess several countermeasures to safeguard sensitive data and information stored in the device. In recent years, semi-invasive physical attacks based on optical debugging techniques have proven to be capable of easily bypassing these security measures implemented in the chip. Optical attacks can reveal the data stored in memory, cache and register through various methods such as photon emission analysis, laser fault injection, laser voltage probing, and thermal laser stimulation. The above-mentioned methods, which employ laser scanning microscopy and photon emission microscopy, are effective because the silicon substrate is transparent to near-infrared (NIR) photons. Therefore, the most vulnerable part of an IC to optical attacks is the backside, where the chip's transistors can be accessed and probed with a NIR laser beam. Although different optical attack detection and … 
    more » « less
  5. Graphics processing units (GPUs) are becoming default accelerators in many domains such as high-performance computing (HPC), deep learning, and virtual/augmented reality. Recently, GPUs have also shown significant speedups for a variety of security-sensitive applications such as encryptions. These speedups have largely benefited from the high memory bandwidth and compute throughput of GPUs. One of the key features to optimize the memory bandwidth consumption in GPUs is intra-warp memory access coalescing, which merges memory requests originating from different threads of a single warp into as few cache lines as possible. However, this coalescing feature is also shown to make the GPUs prone to the correlation timing attacks as it exposes the relationship between the execution time and the number of coalesced accesses. Consequently, an attacker is able to correctly reveal an AES private key via repeatedly gathering encrypted data and execution time on a GPU. In this work, we propose a series of defense mechanisms to alleviate such timing attacks by carefully trading off performance for improved security. Specifically, we propose to randomize the coalescing logic such that the attacker finds it hard to guess the correct number of coalesced accesses generated. To this end, we propose to randomize: a) the granularity (called as subwarp) at which warp threads are grouped together for coalescing, and b) the threads selected by each subwarp for coalescing. Such randomization techniques result in three mechanisms: fixed-sized subwarp (FSS), random-sized subwarp (RSS), and random-threaded subwarp (RTS). We find that the combination of these security mechanisms offers 24- to 961-times improvement in the security against the correlation timing attacks with 5 to 28% performance degradation. Online copy: http://adwaitjog.github.io/docs/pdf/rcoal-hpca18.pdf 
    more » « less