Most programs compiled to WebAssembly (Wasm) today are written in unsafe languages like C and C++. Unfortunately, memory-unsafe C code remains unsafe when compiled to Wasm—and attackers can exploit buffer overflows and use-after-frees in Wasm almost as easily as they can on native platforms. Memory- Safe WebAssembly (MSWasm) proposes to extend Wasm with language-level memory-safety abstractions to precisely address this problem. In this paper, we build on the original MSWasm position paper to realize this vision. We give a precise and formal semantics of MSWasm, and prove that well-typed MSWasm programs are, by construction, robustly memory safe. To this end, we develop a novel, language-independent memory-safety property based on colored memory locations and pointers. This property also lets us reason about the security guarantees of a formal C-to-MSWasm compiler—and prove that it always produces memory-safe programs (and preserves the semantics of safe programs). We use these formal results to then guide several implementations: Two compilers of MSWasm to native code, and a C-to-MSWasm compiler (that extends Clang). Our MSWasm compilers support different enforcement mechanisms, allowing developers to make security-performance trade-offs according to their needs. Our evaluation shows that on the PolyBenchC suite, the overhead of enforcing memory safety in software ranges from 22% (enforcing spatial safety alone) to 198% (enforcing full memory safety), and 51.7% when using hardware memory capabilities for spatial safety and pointer integrity. More importantly, MSWasm’s design makes it easy to swap between enforcement mechanisms; as fast (especially hardware-based) enforcement techniques become available, MSWasm will be able to take advantage of these advances almost for free.
more »
« less
A Formal Model of Checked C
We present a formal model of Checked C, a dialect of C that aims to enforce spatial memory safety. Our model pays particular attention to the semantics of dynamically sized, potentially null-terminated arrays. We formalize this model in Coq, and prove that any spatial memory safety errors can be blamed on portions of the program labeled unchecked; this is a Checked C feature that supports incremental porting and backward compatibility. While our model's operational semantics uses annotated (“fat”) pointers to enforce spatial safety, we show that such annotations can be safely erased. Using PLT Redex we formalize an executable version of our model and a compilation procedure to an untyped C-like language, as well as use randomized testing to validate that generated code faithfully simulates the original. Finally, we develop a custom random generator for well-typed and almost-well-typed terms in our Redex model, and use it to search for inconsistencies between our model and the Clang Checked C implementation. We find these steps to be a useful way to co-develop a language (Checked C is still in development) and a core model of it.
more »
« less
- Award ID(s):
- 1846350
- PAR ID:
- 10443111
- Date Published:
- Journal Name:
- 2022 IEEE 35th Computer Security Foundations Symposium (CSF)
- Page Range / eLocation ID:
- 49 to 63
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We present a gradually typed language, GrEff, with effects and handlers that supports migration from unchecked to checked effect typing. This serves as a simple model of the integration of an effect typing discipline with an existing effectful typed language that does not track fine-grained effect information. Our language supports a simple module system to model the programming model of gradual migration from unchecked to checked effect typing in the style of Typed Racket. The surface language GrEff is given semantics by elaboration to a core language Core GrEff. We equip Core GrEff with an inequational theory for reasoning about the semantic error ordering and desired program equivalences for programming with effects and handlers. We derive an operational semantics for the language from the equations provable in the theory. We then show that the theory is sound by constructing an operational logical relations model to prove the graduality theorem. This extends prior work on embedding-projection pair models of gradual typing to handle effect typing and subtyping.more » « less
-
Model checking has often been used for verifying Cyber-Physical Systems (CPS). A major challenge is how to capture a model that represents the actual behavior of the software. Model extraction can introduce errors that can affect the accuracy of the analysis including loss of precision, inconsistency, non-conformance, and over- and under-approximations.In this paper, we formalize and prove the correctness of extracting a model from a subset of the MicroPython programming language with respect to a trace-based semantics. The extracted models capture the order of method calls and can be model checked using Shelley. We formalize the extraction process from an intermediate representation of MicroPython codes and prove that the behavior of our intermediate representation is a regular language. Our formalization and theoretical results are fully mechanized using the Coq proof assistant.more » « less
-
null (Ed.)Abstract Gradually typed languages are designed to support both dynamically typed and statically typed programming styles while preserving the benefits of each. Sound gradually typed languages dynamically check types at runtime at the boundary between statically typed and dynamically typed modules. However, there is much disagreement in the gradual typing literature over how to enforce complex types such as tuples, lists, functions and objects. In this paper, we propose a new perspective on the design of runtime gradual type enforcement: runtime type casts exist precisely to ensure the correctness of certain type-based refactorings and optimizations. For instance, for simple types, a language designer might desire that beta-eta equality is valid. We show that this perspective is useful by demonstrating that a cast semantics can be derived from beta-eta equality. We do this by providing an axiomatic account program equivalence in a gradual cast calculus in a logic we call gradual type theory (GTT). Based on Levy’s call-by-push-value, GTT allows us to axiomatize both call-by-value and call-by-name gradual languages. We then show that we can derive the behavior of casts for simple types from the corresponding eta equality principle and the assumption that the language satisfies a property called graduality , also known as the dynamic gradual guarantee. Since we can derive the semantics from the assumption of eta equality, we also receive a useful contrapositive: any observably different cast semantics that satisfies graduality must violate the eta equality. We show the consistency and applicability of our axiomatic theory by proving that a contract-based implementation using the lazy cast semantics gives a logical relations model of our type theory, where equivalence in GTT implies contextual equivalence of the programs. Since GTT also axiomatizes the dynamic gradual guarantee, our model also establishes this central theorem of gradual typing. The model is parameterized by the implementation of the dynamic types, and so gives a family of implementations that validate type-based optimization and the gradual guarantee.more » « less
-
Garbage collectors provide memory safety, an important step toward program correctness. However, correctness of the collector itself can be challenging to establish, given both the style in which such systems are written and the weakly-ordered memory accesses of modern hardware. One way to maximize benefits is to use a framework in which effort can be focused on the correctness of small, modular critical components from which various collectors may be composed. Full proof of correctness is likely impractical, so we propose to gain a degree of confidence in collector correctness by applying model checking to critical kernels within a garbage collection framework. We further envisage a model framework, paralleling the framework nature of the collector, in hope that it will be easy to create new models for new collectors. We describe here a prototype model structure, and present results of model checking both stop-the-world and snapshot-at-the-beginning concurrent marking. We found useful regularities of model structure, and that models could be checked within possible time and space budgets on capable servers. This suggests that collectors built in a modular style might be model checked, and further that it may be worthwhile to develop a model checking framework with a domain-specific language from which to generate those models.more » « less
An official website of the United States government

