Cycle-level architectural simulation of Trusted Execution Environments (TEEs) can enable extensive design space exploration of these secure architectures. Existing architectural simulators which support TEEs are either based on hardware-level implementations or abstract analytic models. In this paper, we describe the implementation of the gem5 models necessary to run and evaluate the RISCV-based open source TEE, Keystone, and we discuss how this simulation environment opens new avenues for designing and studying these trusted environments. We show that the Keystone simulations on gem5 exhibit similar performance as the previous hardware evaluations of Keystone. We also describe three simple example use cases (understanding the reason of trusted execution slowdown, performance of memory encryption, and micro-architecture impact on trusted execution performance) to demonstrate how the ability to simulate TEEs can provide useful information about their behavior in the existing form and also with enhanced designs. 
                        more » 
                        « less   
                    
                            
                            Performance Analysis of Scientific Computing Workloads on General Purpose TEEs
                        
                    
    
            Scientific computing sometimes involves computation on sensitive data. Depending on the data and the execution environment, the HPC (high-performance computing) user or data provider may require confidentiality and/or integrity guarantees. To study the applicability of hardware-based trusted execution environments (TEEs) to enable secure scientific computing, we deeply analyze the performance impact of general purpose TEEs, AMD SEV, and Intel SGX, for diverse HPC benchmarks including traditional scientific computing, machine learning, graph analytics, and emerging scientific computing workloads. We observe three main findings: 1) SEV requires careful memory placement on large scale NUMA machines (1×– 3.4× slowdown without and 1×–1.15× slowdown with NUMA aware placement), 2) virtualization—a prerequisite for SEV— results in performance degradation for workloads with irregular memory accesses and large working sets (1×–4× slowdown compared to native execution for graph applications) and 3) SGX is inappropriate for HPC given its limited secure memory size and inflexible programming model (1.2×–126× slowdown over unsecure execution). Finally, we discuss forthcoming new TEE designs and their potential impact on scientific computing. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1850566
- PAR ID:
- 10253802
- Date Published:
- Journal Name:
- 35th IEEE International Parallel & Distributed Processing Symposium
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            null (Ed.)An accurate sense of elapsed time is essential for the safe and correct operation of hardware, software, and networked systems. Unfortunately, an adversary can manipulate the system's time and violate causality, consistency, and scheduling properties of underlying applications. Although cryptographic techniques are used to secure data, they cannot ensure time security as securing a time source is much more challenging, given that the result of inquiring time must be delivered in a timely fashion. In this paper, we first describe general attack vectors that can compromise a system's sense of time. To counter these attacks, we propose a secure time architecture, TIMESEAL that leverages a Trusted Execution Environment (TEE) to secure time-based primitives. While CPU security features of TEEs secure code and data in protected memory, we show that time sources available in TEE are still prone to OS attacks. TIMESEAL puts forward a high-resolution time source that protects against the OS delay and scheduling attacks. Our TIMESEAL prototype is based on Intel SGX and provides sub-millisecond (msec) resolution as compared to 1-second resolution of SGX trusted time. It also securely bounds the relative time accuracy to msec under OS attacks. In essence, TIMESEAL provides the capability of trusted timestamping and trusted scheduling to critical applications in the presence of a strong adversary. It delivers all temporal use cases pertinent to secure sensing, computing, and actuating in networked systems.more » « less
- 
            An increasing number of Trusted Execution Environment (TEE) is adopting to a variety of commercial products for protecting data security on the cloud. However, TEEs are still exposed to various side-channel vulnerabilities, such as execution order-based, timing-based, and power-based vulnerabilities. While recent hardware is applying various techniques to mitigate order-based and timing-based side-channel vulnerabilities, power-based side-channel attacks remain a concern of hardware security, especially for the confidential computing settings where the server machines are beyond the control of cloud users. In this paper, we present PWRLEAK, an attack framework that exploits AMD’s power reporting interfaces to build power side-channel attacks against AMD Secure Encrypted Virtualization (SEV)-protected VM. We design and implement the attack framework with three general steps: (1) identify the instruction running inside AMD SEV, (2) apply a power interpolator to amplify power consumption, including an emulation-based interpolator for analyzing purposes and a moregeneral interrupt-based interpolator, and (3) infer secrets with various analysis approaches. A case study of using the emulation-based interpolator to infer the whole JPEG images processed by libjpeg demonstrates its ability to help analyze power consumption inside SEV VM. Our end-to-end attacks against Intel’s Integrated Performance Primitives (Intel IPP) library indicates that PWRLEAK can be exploited to infer RSA private keys with over 80% accuracy using the interrupt based interpolator.more » « less
- 
            The increasing popularity and ubiquity of various large graph datasets has caused renewed interest for graph partitioning. Existing graph partitioners either scale poorly against large graphs or disregard the impact of the underlying hardware topology. A few solutions have shown that the nonuniform network communication costs may affect the performance greatly. However, none of them considers the impact of resource contention on the memory subsystems (e.g., LLC and Memory Controller) of modern multicore clusters. They all neglect the fact that the bandwidth of modern high-speed networks (e.g., Infiniband) has become comparable to that of the memory subsystems. In this paper, we provide an in-depth analysis, both theoretically and experimentally, on the contention issue for distributed workloads. We found that the slowdown caused by the contention can be as high as 11x. We then design an architecture-aware graph partitioner, ARGO , to allow the full use of all cores of multicore machines without suffering from either the contention or the communication heterogeneity issue. Our experimental study showed (1) the effectiveness of ARGO , achieving up to 12x speedups on three classic workloads: Breadth First Search, Single Source Shortest Path, and PageRank; and (2) the scalability of ARGO in terms of both graph size and the number of partitions on two billion-edge real-world graphs.more » « less
- 
            Trusted execution environments (TEEs) have been proposed to protect GPU computation for machine learning applications operating on sensitive data. However, existing GPU TEE solutions either require CPU and/or GPU hardware modification to realize TEEs for GPUs, which prevents current systems from adopting them, or rely on untrusted system software such as GPU device drivers. In this paper, we propose using CPU secure enclaves, e.g., Intel SGX, to build GPU TEEs without modifications to existing hardware. To tackle the fundamental limitations of these enclaves, such as no support for I/O operations, we design and develop GEVisor, a formally verified security reference monitor software to enable a trusted I/O path between enclaves and GPU without trusting the GPU device driver. GEVisor operates in the Virtual Machine Extension (VMX) root mode, monitors the host system software to prevent unauthorized access to the GPU code and data outside the enclave, and isolates the enclave GPU context from other contexts during GPU computation. We implement and evaluate GEVisor on a commodity machine with an Intel SGX CPU and an NVIDIA Pascal GPU. Our experimental results show that our approach maintains an average overhead of 13.1% for deep learning and 18% for GPU benchmarks compared to native GPU computation while providing GPU TEEs for existing CPU and GPU hardware.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    