

Title: Performance Analysis of Scientific Computing Workloads on General Purpose TEEs
Scientific computing sometimes involves computation on sensitive data. Depending on the data and the execution environment, the HPC (high-performance computing) user or data provider may require confidentiality and/or integrity guarantees. To study the applicability of hardware-based trusted execution environments (TEEs) to enable secure scientific computing, we analyze in depth the performance impact of two general-purpose TEEs, AMD SEV and Intel SGX, on diverse HPC benchmarks spanning traditional scientific computing, machine learning, graph analytics, and emerging scientific workloads. We make three main observations: 1) SEV requires careful memory placement on large-scale NUMA machines (1×–3.4× slowdown without NUMA-aware placement, 1×–1.15× slowdown with it), 2) virtualization, a prerequisite for SEV, degrades performance for workloads with irregular memory accesses and large working sets (1×–4× slowdown over native execution for graph applications), and 3) SGX is inappropriate for HPC given its limited secure memory size and inflexible programming model (1.2×–126× slowdown over unsecured execution). Finally, we discuss forthcoming TEE designs and their potential impact on scientific computing.
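The first finding suggests that keeping a working set on the local NUMA node recovers most of SEV's NUMA slowdown. As a rough illustration (not the paper's methodology), a minimal sketch of NUMA-aware placement with Linux's libnuma, where the buffer size and first-touch pattern are arbitrary examples:

    /* A minimal sketch of NUMA-aware placement with Linux's libnuma, the
     * kind of local-node pinning the abstract reports mitigating SEV's
     * NUMA slowdown. Buffer size and access pattern are illustrative only.
     * Build: gcc -O2 numa_place.c -lnuma */
    #define _GNU_SOURCE
    #include <numa.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA not available on this system\n");
            return 1;
        }

        /* Allocate the working set on the node of the calling CPU, so the
         * threads that touch it see local memory. */
        int node = numa_node_of_cpu(sched_getcpu());
        size_t bytes = 1ull << 30;          /* 1 GiB example working set */
        double *buf = numa_alloc_onnode(bytes, node);
        if (!buf) {
            perror("numa_alloc_onnode");
            return 1;
        }

        memset(buf, 0, bytes);              /* first touch on the chosen node */
        printf("placed %zu bytes on NUMA node %d\n", bytes, node);

        numa_free(buf, bytes);
        return 0;
    }

For an SEV guest, a similar effect can be obtained from the host side, e.g. by launching the VM under numactl with --cpunodebind and --membind so that guest memory and vCPUs stay on one node.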
Award ID(s):
1850566
NSF-PAR ID:
10253802
Author(s) / Creator(s):
Date Published:
Journal Name:
35th IEEE International Parallel & Distributed Processing Symposium
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. An accurate sense of elapsed time is essential for the safe and correct operation of hardware, software, and networked systems. Unfortunately, an adversary can manipulate the system's time and violate the causality, consistency, and scheduling properties of underlying applications. Although cryptographic techniques are used to secure data, they cannot ensure time security: securing a time source is much more challenging, given that the result of inquiring about the time must be delivered in a timely fashion. In this paper, we first describe general attack vectors that can compromise a system's sense of time. To counter these attacks, we propose a secure time architecture, TIMESEAL, that leverages a Trusted Execution Environment (TEE) to secure time-based primitives. While the CPU security features of TEEs secure code and data in protected memory, we show that the time sources available in a TEE are still prone to OS attacks. TIMESEAL puts forward a high-resolution time source that protects against OS delay and scheduling attacks. Our TIMESEAL prototype is based on Intel SGX and provides sub-millisecond (msec) resolution, compared to the 1-second resolution of SGX trusted time. It also securely bounds relative time accuracy to the msec level under OS attacks. In essence, TIMESEAL provides trusted timestamping and trusted scheduling to critical applications in the presence of a strong adversary, covering the temporal use cases pertinent to secure sensing, computing, and actuating in networked systems.
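    TIMESEAL's protected time source is specific to the paper, but the core idea of defending against OS delay attacks can be illustrated outside an enclave: bracket a time query with two monotonic reads and reject the result if the thread was held longer than a bound. A minimal sketch, where the 1 ms bound and the untrusted_time_query placeholder are assumptions for illustration:

        /* Illustrative delay-attack check: reject a timestamp if the OS
         * delayed its delivery beyond a chosen bound. Not TIMESEAL's actual
         * interface; the bound and query function are placeholders. */
        #include <stdint.h>
        #include <stdio.h>
        #include <time.h>

        #define MAX_DELAY_NS 1000000ull   /* 1 ms bound (example value) */

        static uint64_t mono_ns(void) {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
        }

        /* Placeholder for the time source being sanitized. */
        static uint64_t untrusted_time_query(void) { return mono_ns(); }

        int main(void) {
            uint64_t before = mono_ns();
            uint64_t t      = untrusted_time_query();
            uint64_t after  = mono_ns();

            if (after - before > MAX_DELAY_NS) {
                fprintf(stderr, "query delayed %llu ns, rejecting timestamp\n",
                        (unsigned long long)(after - before));
                return 1;
            }
            printf("timestamp %llu accepted (uncertainty <= %llu ns)\n",
                   (unsigned long long)t, (unsigned long long)(after - before));
            return 0;
        }

    Inside an enclave the monotonic clock itself comes from the untrusted OS, which is precisely the gap TIMESEAL's protected high-resolution time source is designed to close.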
  2. A growing number of commercial products are adopting Trusted Execution Environments (TEEs) to protect data security in the cloud. However, TEEs remain exposed to various side-channel vulnerabilities, including execution-order-based, timing-based, and power-based vulnerabilities. While recent hardware applies various techniques to mitigate order-based and timing-based side channels, power-based side-channel attacks remain a hardware-security concern, especially in confidential computing settings where the server machines are beyond the control of cloud users. In this paper, we present PWRLEAK, an attack framework that exploits AMD's power reporting interfaces to build power side-channel attacks against AMD Secure Encrypted Virtualization (SEV)-protected VMs. We design and implement the attack framework in three general steps: (1) identify the instruction running inside AMD SEV, (2) apply a power interpolator to amplify power consumption, including an emulation-based interpolator for analysis purposes and a more general interrupt-based interpolator, and (3) infer secrets with various analysis approaches. A case study using the emulation-based interpolator to reconstruct entire JPEG images processed by libjpeg demonstrates its ability to analyze power consumption inside an SEV VM. Our end-to-end attacks against Intel's Integrated Performance Primitives (Intel IPP) library indicate that PWRLEAK can infer RSA private keys with over 80% accuracy using the interrupt-based interpolator.
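    PWRLEAK's interpolators are specific to the paper, but the basic measurement primitive, reading a package energy counter around a victim interval, can be sketched as below. This assumes the Zen-family RAPL MSR addresses (0xC0010299 for the energy unit, 0xC001029B for package energy) read through Linux's msr driver; the addresses vary by CPU generation, and root access is required:

        /* Sampling a package energy counter around a victim interval via
         * Linux's msr driver. MSR addresses assume a Zen-family layout and
         * vary by CPU generation; requires root and the msr kernel module. */
        #include <fcntl.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <unistd.h>

        #define MSR_RAPL_PWR_UNIT 0xC0010299   /* assumed energy-unit MSR */
        #define MSR_PKG_ENERGY    0xC001029B   /* assumed package-energy MSR */

        static uint64_t rdmsr(int fd, uint32_t reg) {
            uint64_t val = 0;
            if (pread(fd, &val, sizeof(val), reg) != sizeof(val))
                perror("pread msr");
            return val;
        }

        int main(void) {
            int fd = open("/dev/cpu/0/msr", O_RDONLY);
            if (fd < 0) { perror("open /dev/cpu/0/msr"); return 1; }

            /* Bits 12:8 of the unit MSR give 1/2^ESU joules per count. */
            unsigned esu = (unsigned)((rdmsr(fd, MSR_RAPL_PWR_UNIT) >> 8) & 0x1f);
            double joules_per_count = 1.0 / (double)(1ull << esu);

            uint64_t e0 = rdmsr(fd, MSR_PKG_ENERGY);
            usleep(10000);                     /* 10 ms victim interval */
            uint64_t e1 = rdmsr(fd, MSR_PKG_ENERGY);

            printf("energy over interval: %.6f J\n",
                   (double)(e1 - e0) * joules_per_count);
            close(fd);
            return 0;
        }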
  3. Cycle-level architectural simulation of Trusted Execution Environments (TEEs) can enable extensive design-space exploration of these secure architectures. Existing architectural simulators that support TEEs are based either on hardware-level implementations or on abstract analytic models. In this paper, we describe the implementation of the gem5 models necessary to run and evaluate the RISC-V-based open-source TEE Keystone, and we discuss how this simulation environment opens new avenues for designing and studying trusted environments. We show that Keystone simulations on gem5 exhibit performance similar to previous hardware evaluations of Keystone. We also describe three simple example use cases (understanding the sources of trusted-execution slowdown, the performance of memory encryption, and the micro-architectural impact on trusted-execution performance) to demonstrate how the ability to simulate TEEs can provide useful information about their behavior, both in their existing form and with enhanced designs.
  4. As scaling of conventional memory devices has stalled, many high-end computing systems have begun to incorporate alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as higher bandwidth with lower capacity or vice versa, they are typically packaged alongside conventional SDRAM in a heterogeneous memory architecture. To utilize the different types of memory efficiently, new data management strategies are needed to match application usage to the best available memory technology. However, current proposals for managing heterogeneous memories are limited, because they either (1) do not consider high-level application behavior when assigning data to different types of memory or (2) require separate program execution (with a representative input) to collect information about how the application uses memory resources. This work presents a new data management toolset to address the limitations of existing approaches for managing complex memories. It extends the application runtime layer with automated monitoring and management routines that assign application data to the best tier of memory based on previous usage, without any need for source code modification or a separate profiling run. It evaluates this approach on a state-of-the-art server platform with both conventional DDR4 SDRAM and non-volatile Intel Optane DC memory, using both memory-intensive high-performance computing (HPC) applications and standard benchmarks. Overall, the results show that this approach improves program performance significantly compared to a standard unguided approach across a variety of workloads and system configurations. The HPC applications exhibit the largest benefits, with speedups ranging from 1.4× to 7× in the best cases. Additionally, we show that this approach achieves similar performance as a comparable offline profiling-based approach after a short startup period, without requiring separate program execution or offline analysis steps.
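    The paper's toolset performs this placement automatically at the runtime layer, but the underlying tiering distinction can be shown with explicit allocation calls. A minimal sketch using the memkind library (not the paper's tool), where the buffer sizes are arbitrary and MEMKIND_DAX_KMEM assumes Optane is configured as a system-RAM NUMA node:

        /* Explicit two-tier allocation with the memkind library: hot data in
         * DRAM, a large cold buffer in Optane exposed as a system-RAM NUMA
         * node. The paper's toolset makes this placement automatically; this
         * sketch only shows the underlying distinction.
         * Build: gcc tier.c -lmemkind (requires a configured DAX KMEM node) */
        #include <memkind.h>
        #include <stdio.h>

        int main(void) {
            /* Hot, latency/bandwidth-sensitive data in conventional DDR4. */
            double *hot = memkind_malloc(MEMKIND_DEFAULT, 1 << 20);

            /* Cold, capacity-bound data in Optane (DAX KMEM NUMA node). */
            double *cold = memkind_malloc(MEMKIND_DAX_KMEM, 1ull << 30);

            if (!hot || !cold) {
                fprintf(stderr, "allocation failed (is a DAX KMEM node configured?)\n");
                return 1;
            }
            printf("hot tier at %p (DRAM), cold tier at %p (Optane)\n",
                   (void *)hot, (void *)cold);

            memkind_free(MEMKIND_DEFAULT, hot);
            memkind_free(MEMKIND_DAX_KMEM, cold);
            return 0;
        }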
  5. COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and the scientific community's broad response to it have forged new relationships among domain experts, mathematical modelers, and scientific computing specialists. Computationally, however, it also revealed critical gaps in researchers' ability to exploit advanced computing systems. These challenges include gaining access to scalable computing systems, porting models and workflows to new systems, sharing data of varying sizes, and producing results that can be reproduced and validated by others. Informed by our team's work supporting public health decision makers during the COVID-19 pandemic and by the identified capability gaps in applying high-performance computing (HPC) to the modeling of complex social systems, we present the goals, requirements, and initial implementation of OSPREY, an open science platform for robust epidemic analysis. The prototype implementation demonstrates an integrated, algorithm-driven HPC workflow architecture that coordinates tasks across federated HPC resources, with robust, secure, and automated access to each resource. We demonstrate scalable and fault-tolerant task execution, an asynchronous API to support fast time-to-solution algorithms, an inclusive, multi-language approach, and efficient wide-area data management. The example OSPREY code is available in a public repository.