skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Jumanji: The Case for Dynamic NUCA in the Datacenter
The datacenter introduces new challenges for computer systems around tail latency and security. This paper argues that dynamic NUCA techniques are a better solution to these challenges than prior cache designs. We show that dynamic NUCA designs can meet tail-latency deadlines with much less cache space than prior work, and that they also provide a natural defense against cache attacks. Unfortunately, prior dynamic NUCAs have missed these opportunities because they focus exclusively on reducing data movement.We present Jumanji, a dynamic NUCA technique designed for tail latency and security. We show that prior last-level cache designs are vulnerable to new attacks and offer imperfect performance isolation. Jumanji solves these problems while significantly improving performance of co-running batch applications. Moreover, Jumanji only requires lightweight hardware and a few simple changes to system software, similar to prior D-NUCAs. At 20 cores, Jumanji improves batch weighted speedup by 14% on average, vs. just 2% for a non-NUCA design with weaker security, and is within 2% of an idealized design.  more » « less
Award ID(s):
1845986
PAR ID:
10281096
Author(s) / Creator(s):
;
Date Published:
Journal Name:
MICRO
Page Range / eLocation ID:
665 to 680
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Randomization has proven to be a good defense against conflictbased side-channel attacks in a shared cache. It improves security by assigning a unique randomization scheme to each security domain, e.g., though a different hashing function. However, if two domains have shared data, the domains must be fused in order to guarantee correctness (i.e., data coherence). Such domain fusion significantly reduces the effectiveness of randomization and weakens its security protection. We propose randomization with sharing (RAWS), which enables secure cross-domain accesses while enforcing cache coherence (and thus data coherence). Based on RAWS, we design a non-fusion based inter-domain coherence protocol (NF-IDCP). NF-IDCP enables cache coherence by looking up and flushing multiple cache lines associated with shared-writable data during their cross-domain accesses. Furthermore, NF-IDCP uses constant-delay banking to securely reduce the latency of the cache line flushes. We also use a secure tag-based filter (STF) to reduce flush costs, for example, by explicitly storing the exact cache locations to be flushed. The security evaluation shows that conflict attacks on the optimized NF-IDCP structures cannot leak conflict observations at a meaningful rate. Attack simulations using CacheFX demonstrate that domain fusion significantly retards the protection provided by randomization schemes. Performance overhead of SPECrate 2017 and PARSEC 3.0 benchmarks is evaluated on ZSim, a microarchitectural simulator. To study the performance impact on realistic workloads, such as Firefox, Chromium and X Server, we use a cache simulator built on top of PANDA, a full-system emulator. Across all configurations, the average performance overhead is less than 5%, and the hardware overhead is less than 3% compared to a domainfused randomization. 
    more » « less
  2. Typical cybersecurity solutions emphasize on achieving defense functionalities. However, execution efficiency and scalability are equally important, especially for real-world deployment. Straightforward mappings of cybersecurity applications onto HPC platforms may significantly underutilize the HPC devices’ capacities. On the other hand, the sophisticated implementations are quite difficult: they require both in-depth understandings of cybersecurity domain-specific characteristics and HPC architecture and system model. In our work, we investigate three sub-areas in cybersecurity, including mobile software security, network security, and system security. They have the following performance issues, respectively: 1) The flow- and context-sensitive static analysis for the large and complex Android APKs are incredibly time-consuming. Existing CPU-only frameworks/tools have to set a timeout threshold to cease the program analysis to trade the precision for performance. 2) Network intrusion detection systems (NIDS) use automata processing as its searching core and requires line-speed processing. However, achieving high-speed automata processing is exceptionally difficult in both algorithm and implementation aspects. 3) It is unclear how the cache configurations impact time-driven cache side-channel attacks’ performance. This question remains open because it is difficult to conduct comparative measurement to study the impacts. In this dissertation, we demonstrate how application-specific characteristics can be leveraged to optimize implementations on various types of HPC for faster and more scalable cybersecurity executions. For example, we present a new GPU-assisted framework and a collection of optimization strategies for fast Android static data-flow analysis that achieve up to 128X speedups against the plain GPU implementation. For network intrusion detection systems (IDS), we design and implement an algorithm capable of eliminating the state explosion in out-of-order packet situations, which reduces up to 400X of the memory overhead. We also present tools for improving the usability of Micron’s Automata Processor. To study the cache configurations’ impact on time-driven cache side-channel attacks’ performance, we design an approach to conducting comparative measurement. We propose a quantifiable success rate metric to measure the performance of time-driven cache attacks and utilize the GEM5 platform to emulate the configurable cache. 
    more » « less
  3. null (Ed.)
    An attacker can obtain a valid TLS certificate for a domain by hijacking communication between a certificate authority (CA) and a victim domain. Performing domain validation from multiple vantage points can defend against these attacks. We explore the design space of multi-vantage-point domain validation to achieve (1) security via sufficiently diverse vantage points, (2) performance by ensuring low latency and overhead in certificate issuance, (3) manageability by complying with CA/Browser forum requirements, and requiring minimal changes to CA operations, and (4) a low benign failure rate for legitimate requests. Our opensource implementation was deployed by the Let's Encrypt CA in February 2020, and has since secured the issuance of more than half a billion certificates during the first year of its deployment. Using real-world operational data from Let's Encrypt, we show that our approach has negligible latency and communication overhead, and a benign failure rate comparable to conventional designs with one vantage point. Finally, we evaluate the security improvements using a combination of ethically conducted real-world BGP hijacks, Internet-scale traceroute experiments, and a novel BGP simulation framework. We show that multi-vantage-point domain validation can thwart the vast majority of BGP attacks. Our work motivates the deployment of multi-vantage-point domain validation across the CA ecosystem to strengthen TLS certificate issuance and user privacy. 
    more » « less
  4. An attacker can obtain a valid TLS certificate for a domain by hijacking communication between a certificate authority (CA) and a victim domain. Performing domain validation from multiple vantage points can defend against these attacks. We explore the design space of multi-vantage-point domain validation to achieve (1) security via sufficiently diverse vantage points, (2) performance by ensuring low latency and overhead in certificate issuance, (3) manageability by complying with CA/Browser forum requirements, and requiring minimal changes to CA operations, and (4) a low benign failure rate for legitimate requests. Our open- source implementation was deployed by the Let’s Encrypt CA in February 2020, and has since secured the issuance of more than half a billion certificates during the first year of its deployment. Using real-world operational data from Let’s Encrypt, we show that our approach has negligible latency and communication overhead, and a benign failure rate comparable to conventional designs with one vantage point. Finally, we evaluate the security improvements using a combination of ethically conducted real-world BGP hijacks, Internet-scale traceroute experiments, and a novel BGP simulation framework. We show that multi-vantage-point domain validation can thwart the vast majority of BGP attacks. Our work motivates the deployment of multi-vantage-point domain validation across the CA ecosystem to strengthen TLS certificate issuance and user privacy. 
    more » « less
  5. Serverless computing is a new pay-per-use cloud service paradigm that automates resource scaling for stateless functions and can potentially facilitate bursty machine learning serving. Batching is critical for latency performance and cost-effectiveness of machine learning inference, but unfortunately it is not supported by existing serverless platforms due to their stateless design. Our experiments show that without batching, machine learning serving cannot reap the benefits of serverless computing. In this paper, we present BATCH, a framework for supporting efficient machine learning serving on serverless platforms. BATCH uses an optimizer to provide inference tail latency guarantees and cost optimization and to enable adaptive batching support. We prototype BATCH atop of AWS Lambda and popular machine learning inference systems. The evaluation verifies the accuracy of the analytic optimizer and demonstrates performance and cost advantages over the state-of-the-art method MArk and the state-of-the-practice tool SageMaker. 
    more » « less