NSF PAR Search | NSF Public Access Repository

Enabling Design Space Exploration for RISC-V Secure Compute Environments

Akram, Ayaz; Akella, Venkatesh; Peisert, Sean; Lowe-Power, Jason (June 2021, Fifth Workshop on Computer Architecture Research with RISC-V)

Cycle-level architectural simulation of Trusted Execution Environments (TEEs) can enable extensive design space exploration of these secure architectures. Existing architectural simulators that support TEEs are based on either hardware-level implementations or abstract analytic models. In this paper, we describe the implementation of the gem5 models necessary to run and evaluate the RISC-V-based open source TEE, Keystone, and we discuss how this simulation environment opens new avenues for designing and studying these trusted environments. We show that the Keystone simulations on gem5 exhibit performance similar to that of the previous hardware evaluations of Keystone. We also describe three simple example use cases (understanding the reason for trusted execution slowdown, performance of memory encryption, and micro-architecture impact on trusted execution performance) to demonstrate how the ability to simulate TEEs can provide useful information about their behavior in the existing form and also with enhanced designs.
Full Text Available
A Case Against Hardware Managed DRAM Caches for NVRAM Based Systems

https://doi.org/10.1109/ISPASS51385.2021.00036

Hildebrand, Mark; Angeles, Julian T.; Lowe-Power, Jason; Akella, Venkatesh (March 2021, 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS))
Non-volatile memory (NVRAM) based on phase-change memory (such as Optane DC Persistent Memory Module) is making its way into Intel servers to address the needs of emerging applications that have a huge memory footprint. These systems have both DRAM and NVRAM on the same memory channel, with the smaller-capacity DRAM serving as a cache for the larger-capacity NVRAM in the so-called 2LM mode. In this work we analyze the performance of such DRAM caches on real hardware using a broad range of synthetic and real-world benchmarks. We identify three key limitations of DRAM caches in these emerging systems which prevent large-scale, bandwidth-bound applications from taking full advantage of NVRAM read and write bandwidth. We show that software-based techniques are necessary for orchestrating the data movement between DRAM and PMM for such workloads to take full advantage of these new heterogeneous memory systems.
Full Text Available
Performance Analysis of Scientific Computing Workloads on General Purpose TEEs

Akram, Ayaz; Giannakou, Anna; Akella, Venkatesh; Lowe-Power, Jason; Peisert, Sean (January 2021, 35th IEEE International Parallel & Distributed Processing Symposium)
Scientific computing sometimes involves computation on sensitive data. Depending on the data and the execution environment, the HPC (high-performance computing) user or data provider may require confidentiality and/or integrity guarantees. To study the applicability of hardware-based trusted execution environments (TEEs) to enable secure scientific computing, we deeply analyze the performance impact of two general-purpose TEEs, AMD SEV and Intel SGX, for diverse HPC benchmarks including traditional scientific computing, machine learning, graph analytics, and emerging scientific computing workloads. We observe three main findings: 1) SEV requires careful memory placement on large-scale NUMA machines (1×–3.4× slowdown without and 1×–1.15× slowdown with NUMA-aware placement), 2) virtualization, a prerequisite for SEV, results in performance degradation for workloads with irregular memory accesses and large working sets (1×–4× slowdown compared to native execution for graph applications), and 3) SGX is inappropriate for HPC given its limited secure memory size and inflexible programming model (1.2×–126× slowdown over unsecure execution). Finally, we discuss forthcoming new TEE designs and their potential impact on scientific computing.
Full Text Available
AutoTM: Automatic Tensor Movement in Heterogeneous Memory Systems using Integer Linear Programming

https://doi.org/10.1145/3373376.3378465

Hildebrand, Mark; Khan, Jawad; Trika, Sanjeev; Lowe-Power, Jason; Akella, Venkatesh (March 2020, ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems)

Memory capacity is a key bottleneck for training large-scale neural networks. Intel® Optane DC PMM (persistent memory modules), which are available as NVDIMMs, are a disruptive technology that promises significantly higher read bandwidth than traditional SSDs at a lower cost per bit than traditional DRAM. In this work we show how to take advantage of this new memory technology to minimize the amount of DRAM required without compromising performance significantly. Specifically, we take advantage of the static nature of the underlying computational graphs in deep neural network applications to develop a profile-guided optimization based on Integer Linear Programming (ILP), called AutoTM, to optimally assign and move live tensors to either DRAM or NVDIMMs. Our approach can replace 50% to 80% of a system's DRAM with PMM while only losing a geometric mean 27.7% performance. This is a significant improvement over first-touch NUMA, which loses 71.9% of performance. The proposed ILP-based synchronous scheduling technique also provides 2x performance over using DRAM as a hardware-controlled cache for very large networks.
Full Text Available
