A key ingredient contributing to the success of CompCert, the state-of-the-art verified compiler for C, is its block-based memory model, which is used uniformly for all of its languages and their verified compilation. However, CompCert's memory model lacks an explicit notion of stack. Its target assembly language represents the runtime stack as an unbounded list of memory blocks, making further compilation of CompCert assembly into more realistic machine code difficult since it is not possible to merge these blocks into a finite and continuous stack. Furthermore, various notions of verified compositional compilation rely on some kind of mechanism for protecting private stack data and enabling modification to the public stack-allocated data, which is lacking in the original CompCert. These problems have been investigated but not fully addressed before, in the sense that some advanced optimization passes that significantly change the ways stack blocks are (de-)allocated, such as tailcall recognition and inlining, are often omitted. We propose a lightweight and complete solution to the above problems. It is based on the enrichment of CompCert's memory model with an abstract stack that keeps track of the history of stack frames to bound the stack consumption and that enforces a uniform stack access policy by assigning fine-grained permissions to stack memory. Using this enriched memory model for all the languages of CompCert, we are able to reprove the correctness of the full compilation chain of CompCert, including all the optimization passes. In the end, we get Stack-Aware CompCert, a complete extension of CompCert that enforces the finiteness of the stack and fine-grained stack permissions. Based on Stack-Aware CompCert, we develop CompCertMC, the first extension of CompCert that compiles into a low-level language with flat memory spaces. Based on CompCertMC, we develop Stack-Aware CompCertX, a complete extension of CompCert that supports a notion of compositional compilation that we call contextual compilation by exploiting the uniform stack access policy provided by the abstract stack.
more »
« less
Verified compilation of C programs with a nominal memory model
Memory models play an important role in verified compilation of imperative programming languages. A representative one is the block-based memory model of CompCert---the state-of-the-art verified C compiler. Despite its success, the abstraction over memory space provided by CompCert's memory model is still primitive and inflexible. In essence, it uses a fixed representation for identifying memory blocks in a global memory space and uses a globally shared state for distinguishing between used and unused blocks. Therefore, any reasoning about memory must work uniformly for the global memory; it is impossible to individually reason about different sub-regions of memory (i.e., the stack and global definitions). This not only incurs unnecessary complexity in compiler verification, but also poses significant difficulty for supporting verified compilation of open or concurrent programs which need to work with contextual memory, as manifested in many previous extensions of CompCert. To remove the above limitations, we propose an enhancement to the block-based memory model based on nominal techniques; we call it the nominal memory model. By adopting the key concepts of nominal techniques such as atomic names and supports to model the memory space, we are able to 1) generalize the representation of memory blocks to any types satisfying the properties of atomic names and 2) remove the global constraints for managing memory blocks, enabling flexible memory structures for open and concurrent programs. To demonstrate the effectiveness of the nominal memory model, we develop a series of extensions of CompCert based on it. These extensions show that the nominal memory model 1) supports a general framework for verified compilation of C programs, 2) enables intuitive reasoning of compiler transformations on partial memory; and 3) enables modular reasoning about programs working with contextual memory. We also demonstrate that these extensions require limited changes to the original CompCert, making the verification techniques based on the nominal memory model easy to adopt.
more »
« less
- PAR ID:
- 10602891
- Publisher / Repository:
- Association for Computing Machinery (ACM)
- Date Published:
- Journal Name:
- Proceedings of the ACM on Programming Languages
- Volume:
- 6
- Issue:
- POPL
- ISSN:
- 2475-1421
- Format(s):
- Medium: X Size: p. 1-31
- Size(s):
- p. 1-31
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Verified compilation of open modules (i.e., modules whose functionality depends on other modules) provides a foundation for end-to-end verification of modular programs ubiquitous in contemporary software. However, despite intensive investigation in this topic for decades, the proposed approaches are still difficult to use in practice as they rely on assumptions about the internal working of compilers which make it difficult for external users to apply the verification results. We propose an approach to verified compositional compilation without such assumptions in the setting of verifying compilation of heterogeneous modules written in first-order languages supporting global memory and pointers. Our approach is based on the memory model of CompCert and a new discovery that a Kripke relation with a notion of memory protection can serve as a uniform and composable semantic interface for the compiler passes. By absorbing the rely-guarantee conditions on memory evolution for all compiler passes into this Kripke Memory Relation and by piggybacking requirements on compiler optimizations onto it, we get compositional correctness theorems for realistic optimizing compilers as refinements that directly relate native semantics of open modules and that are ignorant of intermediate compilation processes. Such direct refinements support all the compositionality and adequacy properties essential for verified compilation of open modules. We have applied this approach to the full compilation chain of CompCert with its Clight source language and demonstrated that our compiler correctness theorem is open to composition and intuitive to use with reduced verification complexity through end-to-end verification of non-trivial heterogeneous modules that may freely invoke each other (e.g., mutually recursively).more » « less
-
Yoshida, Nobuko (Ed.)Modularity - the partitioning of software into units of functionality that interact with each other via interfaces - has been the mainstay of software development for half a century. In case of the C language, the main mechanism for modularity is the compilation unit / header file abstraction. This paper complements programmatic modularity for C with modularity idioms for specification and verification in the context of Verifiable C, an expressive separation logic for CompCert Clight. Technical innovations include (i) abstract predicate declarations – existential packages that combine Parkinson & Bierman’s abstract predicates with their client-visible reasoning principles; (ii) residual predicates, which help enforcing data abstraction in callback-rich code; and (iii) an application to pure (Smalltalk-style) objects that connects code verification to model-level reasoning about features such as subtyping, self, inheritance, and late binding. We introduce our techniques using concrete example modules that have all been verified using the Coq proof assistant and combine to fully linked verified programs using a novel, abstraction-respecting component composition rule for Verifiable C.more » « less
-
Abstract Memory safety is a fundamental correctness property of software. For programs that manipulate linked, heap-allocated data structures, ensuring memory safety requires analyzing their possible shapes. Despite significant advances in shape analysis, existing techniques rely on hand-crafted domains tailored to specific data structures, making them difficult to generalize and extend. This paper presents a novel approach that reduces memory-safety proofs to the verification of heap-less imperative programs, enabling the use of off-the-shelf software verification tools. We achieve this reduction through two complementary program instrumentation techniques: space invariants, which enable symbolic reasoning about unbounded heaps, and flow abstraction, which encodes global heap properties as local flow equations. The approach effectively verifies memory safety across a broad range of programs, including concurrent lists and trees that lie beyond the reach of existing shape analysis tools.more » « less
-
Berg, Jeremias; Nordström, Jakob (Ed.)Knowledge compilers convert Boolean formulas, given in conjunctive normal form (CNF), into representations that enable efficient evaluation of unweighted and weighted model counts, as well as a variety of other useful properties. With projected knowledge compilation, the generated representation describes the restriction of the formula to a designated set of data variables, with the remaining ones eliminated by existential quantification. Projected knowledge compilation has applications in a variety of domains, including formal verification and synthesis. This paper describes a formally verified proof framework for certifying the output of a projected knowledge compiler. It builds on an earlier clausal proof framework for certifying the output of a standard knowledge compiler. Extending the framework to projected compilation requires a method to represent Skolem assignments, describing how the quantified variables can be assigned, given an assignment for the data variables. We do so by extending the representation generated by the knowledge compiler to also encode Skolem assignments. We also refine the earlier framework, moving beyond purely clausal proofs to enable scaling certification to larger formulas. We present experimental results obtained by making small modifications to the D4 projected knowledge compiler and extensions of our earlier proof generator. We detail a soundness argument stating that a compiler output that passes our certifier is logically equivalent to the quantified input formula; the soundness argument has been formally validated using the HOL4 proof assistant. The checker also ensures that the compiler output satisfies the properties required for efficient unweighted and weighted model counting. We have developed two proof checkers for the certification framework: one written in C and designed for high performance and one written in CakeML and formally verified in HOL4.more » « less
An official website of the United States government
