Side-channel attacks, such as Spectre and Meltdown, that leverage speculative execution pose a serious threat to computing systems. Worse yet, such attacks can be perpetrated by compromised operating system (OS) kernels to bypass defenses that protect applications from the OS kernel. This work evaluates the performance impact of three different defenses against in-kernel speculation side-channel attacks within the context of Virtual Ghost, a system that protects user data from compromised OS kernels: Intel MPX bounds checks, which require a memory fence; address bit-masking and testing, which creates a dependence between the bounds check and the load/store; and the use of separate virtual address spaces for applications, the OS kernel, and the Virtual Ghost virtual machine, forcing a speculation boundary. Our results indicate that an instrumentation-based bit-masking approach to protection incurs the least overhead by minimizing speculation boundaries. Our work also highlights possible improvements to Intel MPX that could help mitigate speculation side-channel attacks at a lower cost.
more »
« less
Scalable FSM parallelization via path fusion and higher-order speculation
Finite-state machine (FSM) is a fundamental computation model used by many applications. However, FSM execution is known to be “embarrassingly sequential” due to the state dependences among transitions. Existing solutions leverage enumerative or speculative parallelization to break the dependences. However, the efficiency of both parallelization schemes highly depends on the properties of the FSM and its inputs. For those exhibiting unfavorable properties, the former suffers from the overhead of maintaining multiple execution paths, while the latter is bottlenecked by the serial reprocessing among the misspeculation cases. Either way, the FSM parallelization scalability is seriously compromised. This work addresses the above scalability challenges with two novel techniques. First, for enumerative parallelization, it proposes path fusion. Inspired by the classic NFA to DFA conversion, it maps a vector of states in the original FSM to a new (fused) state. In this way, path fusion can reduce multiple FSM execution paths into a single path, minimizing the overhead of path maintenance. Second, for speculative parallelization, this work introduces higher-order speculation to avoid the serial reprocessing during validations. This is a generalized speculation model that allows speculated states to be validated speculatively. Finally, this work integrates different schemes of FSM parallelization into a framework—BoostFSM, which automatically selects the best based on the relevant properties of the FSM. Evaluation using real-world FSMs with diverse characteristics shows that BoostFSM can raise the average speedup from 3.1× and 15.4× of the existing speculative and enumerative parallelization schemes, respectively, to 25.8× on a 64-core machine.
more »
« less
- Award ID(s):
- 1751392
- PAR ID:
- 10222360
- Date Published:
- Journal Name:
- Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’21)
- Page Range / eLocation ID:
- 887 to 901
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Machine learning (ML) training is commonly parallelized using data parallelism. A fundamental limitation of data parallelism is that conflicting (concurrent) parameter accesses during ML training usually diminishes or even negates the benefits provided by additional parallel compute resources. Although it is possible to avoid conflicting parameter accesses by carefully scheduling the computation, existing systems rely on programmer manual parallelization and it remains a question when such parallelization is possible. We present Orion, a system that automatically parallelizes serial imperative ML programs on distributed shared memory. The core of Orion is a static dependence analysis mechanism that determines when dependence-preserving parallelization is effective and maps a loop computation to an optimized distributed computation schedule. Our evaluation shows that for a number of ML applications, Orion can parallelize a serial program while preserving critical dependences and thus achieve a significantly faster convergence rate than data-parallel programs and a matching convergence rate and comparable computation throughput to state-of-the-art manual parallelizations including model-parallel programs.more » « less
-
The high-profile Spectre attack and its variants have revealed that speculative execution may leave secret-dependent footprints in the cache, allowing an attacker to learn confidential data. However, existing static side-channel detectors either ignore speculative execution, leading to false negatives, or lack a precise cache model, leading to false positives. In this paper, somewhat surprisingly, we show that it is challenging to develop a speculation-aware static analysis with precise cache models: a combination of existing works does not necessarily catch all cache side channels. Motivated by this observation, we present a new semantic definition of security against cache-based side-channel attacks, called Speculative-Aware noninterference (SANI), which is applicable to a variety of attacks and cache models. We also develop SpecSafe to detect the violations of SANI. Unlike other speculation-aware symbolic executors, SpecSafe employs a novel program transformation so that SANI can be soundly checked by speculation-unaware side-channel detectors. SpecSafe is shown to be both scalable and accurate on a set of moderately sized benchmarks, including commonly used cryptography libraries.more » « less
-
null (Ed.)We introduce Blade, a new approach to automatically and efficiently eliminate speculative leaks from cryptographic code. Blade is built on the insight that to stop leaks via speculative execution, it suffices to cut the dataflow from expressions that speculatively introduce secrets ( sources ) to those that leak them through the cache ( sinks ), rather than prohibit speculation altogether. We formalize this insight in a static type system that (1) types each expression as either transient , i.e., possibly containing speculative secrets or as being stable , and (2) prohibits speculative leaks by requiring that all sink expressions are stable. Blade relies on a new abstract primitive, protect , to halt speculation at fine granularity. We formalize and implement protect using existing architectural mechanisms, and show how Blade’s type system can automatically synthesize a minimal number of protect s to provably eliminate speculative leaks. We implement Blade in the Cranelift WebAssembly compiler and evaluate our approach by repairing several verified, yet vulnerable WebAssembly implementations of cryptographic primitives. We find that Blade can fix existing programs that leak via speculation automatically , without user intervention, and efficiently even when using fences to implement protect .more » « less
-
Speculative execution attacks leverage the speculative and out-of-order execution features in modern computer processors to access secret data or execute code that should not be executed. Secret information can then be leaked through a covert channel. While software patches can be installed for mitigation on existing hardware, these solutions can incur big performance overhead. Hardware mitigation is being studied extensively by the computer architecture community. It has the benefit of preserving software compatibility and the potential for much smaller performance overhead than software solutions. This paper presents a systematization of the hardware defenses against speculative execution attacks that have been proposed. We show that speculative execution attacks consist of 6 critical attack steps. We propose defense strategies, each of which prevents a critical attack step from happening, thus preventing the attack from succeeding. We then summarize 20 hardware defenses and overhead-reducing features that have been proposed. We show that each defense proposed can be classified under one of our defense strategies, which also explains why it can thwart the attack from succeeding. We discuss the scope of the defenses, their performance overhead, and the security-performance trade-offs that can be made.more » « less
An official website of the United States government

