Title: What If We Don't Pop the Stack? The Return of 2nd-Class Values
Using a stack for managing the local state of procedures, as popularized by Algol, is a simple but effective way to achieve a primitive form of automatic memory management. Hence, the call stack remains the backbone of most programming language runtimes to the present day. However, the appealing simplicity of the call stack model comes at the price of strictly enforced limitations: since every function return pops the stack, it is difficult to return stack-allocated data from a callee upwards to its caller, especially variable-size data such as closures. This paper proposes a solution by introducing a small tweak to the usual stack semantics. We design a type system that tracks the underlying storage mode of values, and when a function returns a stack-allocated value, we just don't pop the stack! Instead, the stack frame is de-allocated together with its parent the next time a heap-allocated value or primitive is returned. We identify a range of use cases where this delayed-popping strategy is beneficial, ranging from closures and trait objects to other kinds of variable-size data. Our evaluation shows that this execution model reduces heap and GC pressure, recovers the spatial locality of programs, and improves execution time by 10% to 25% relative to standard execution.
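To make the delayed-popping idea concrete, here is a minimal C sketch of a toy runtime: a bump-allocated stack in which a callee that returns a stack-allocated value leaves its frame in place, and the frame is only reclaimed together with its parent when a primitive is returned. The names (`stack_alloc`, `Value`, `make_pair_on_stack`, `mark`) are invented for illustration; this is not the paper's type-system-driven implementation, only a sketch of the strategy described above.

```c
#include <stdio.h>

static long   stack[1 << 12]; /* bump-allocated data stack, in words    */
static size_t sp = 0;         /* next free word                         */

typedef enum { PRIMITIVE, STACK_REF } Kind;
typedef struct { Kind kind; long prim; long *ref; } Value;

static long *stack_alloc(size_t nwords) { long *p = &stack[sp]; sp += nwords; return p; }

/* Callee: builds a variable-size result in its own frame and returns a
 * reference into that frame. Under the delayed-pop discipline the frame
 * must stay live, so the callee does NOT restore sp before returning.   */
static Value make_pair_on_stack(long a, long b) {
    long *pair = stack_alloc(2);
    pair[0] = a; pair[1] = b;
    return (Value){ .kind = STACK_REF, .ref = pair };
}

/* Caller: consumes the stack-allocated result and returns a primitive.
 * Only now is popping safe, and the caller's and callee's frames are
 * de-allocated together in a single step.                               */
static long sum_pair(long a, long b) {
    size_t mark = sp;                    /* frame boundary on entry        */
    Value v = make_pair_on_stack(a, b);  /* callee frame deliberately kept */
    long result = v.ref[0] + v.ref[1];   /* still valid: nothing was popped */
    sp = mark;                           /* pop both frames at once        */
    return result;
}

int main(void) {
    printf("%ld\n", sum_pair(3, 4));     /* prints 7; sp is back to 0      */
    return 0;
}
```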
Award ID(s):
1910216
PAR ID:
10479688
Author(s) / Creator(s):
; ; ;
Editor(s):
Karim Ali and Jan Vitek
Publisher / Repository:
Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
Date Published:
Journal Name:
36th European Conference on Object-Oriented Programming (ECOOP 2022)
Format(s):
Medium: X
Location:
Berlin, Germany
Sponsoring Org:
National Science Foundation
More Like this
  1. Motivated by the significantly higher cost of writing than reading in emerging memory technologies, we consider parallel algorithm design under such asymmetric read-write costs, with the goal of reducing the number of writes while preserving work-efficiency and low span. We present a nested-parallel model of computation that combines (i) small per-task stack-allocated memories with symmetric read-write costs and (ii) an unbounded heap-allocated shared memory with asymmetric read-write costs, and show how the costs in the model map efficiently onto a more concrete machine model under a work-stealing scheduler. We use the new model to design reduced-write, work-efficient, low-span parallel algorithms for a number of fundamental problems such as reduce, list contraction, tree contraction, breadth-first search, ordered filter, and planar convex hull. For the latter two problems, our algorithms are output-sensitive in that the work and number of writes decrease with the output size. We also present a reduced-write, low-span minimum spanning tree algorithm that is nearly work-efficient (off by the inverse Ackermann function). Our algorithms reveal several interesting techniques for significantly reducing shared-memory writes in parallel algorithms without asymptotically increasing the number of shared-memory reads.
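As a rough C sketch of the asymmetric-cost idea (not the paper's model or algorithms), the snippet below uses an assumed write/read cost ratio `OMEGA` and hypothetical `shared_read`/`shared_write` counters to show why keeping an accumulator in symmetric, stack-local memory and writing the result once reduces expensive shared-memory writes.

```c
#include <stdio.h>

#define OMEGA 10                 /* assumed write/read cost ratio        */
static long shared_cost = 0;

static long shared_read(const long *p)    { shared_cost += 1;     return *p; }
static void shared_write(long *p, long v) { shared_cost += OMEGA; *p = v;    }

/* Reduce n shared values into one shared cell: n reads plus 1 write,
 * costing n + OMEGA, instead of n + n*OMEGA for a running total kept
 * in shared memory.                                                      */
static void reduce(const long *in, long n, long *out) {
    long acc = 0;                /* stack-allocated, symmetric cost       */
    for (long i = 0; i < n; i++)
        acc += shared_read(&in[i]);
    shared_write(out, acc);
}

int main(void) {
    long xs[8] = {1, 2, 3, 4, 5, 6, 7, 8}, result = 0;
    reduce(xs, 8, &result);
    printf("sum=%ld cost=%ld\n", result, shared_cost);  /* sum=36 cost=18 */
    return 0;
}
```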
  2. The efficient implementation of function calls and non-local control transfers is a critical part of modern language implementations and is important in the implementation of everything from recursion, higher-order functions, concurrency and coroutines, to task-based parallelism. In a compiler, these features can be supported by a variety of mechanisms, including call stacks, segmented stacks, and heap-allocated continuation closures. An implementor of a high-level language with advanced control features might ask the question "what is the best choice for my implementation?" Unfortunately, the current literature does not provide much guidance, since previous studies suffer from various flaws in methodology and are outdated for modern hardware. In the absence of recent, well-normalized measurements and a holistic overview of their implementation specifics, the path of least resistance when choosing a strategy is to trust folklore, but the folklore is also suspect. This paper attempts to remedy this situation by providing an "apples-to-apples" comparison of six different approaches to implementing call stacks and continuations. This comparison uses the same source language, compiler pipeline, LLVM backend, and runtime system, with the only differences being those required by the differences in implementation strategy. We compare the implementation challenges of the different approaches, their sequential performance, and their suitability to support advanced control mechanisms, including supporting heavily threaded code. In addition to the comparison of implementation strategies, the paper's contributions also include a number of useful implementation techniques that we discovered along the way.
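To illustrate one strategy named above, heap-allocated continuation closures, here is a toy C sketch contrasting it with the native call stack. The `Cont` type and `fact_cps` function are invented for this example and do not reflect the paper's compiler or runtime; they only show how a "frame" can become a heap object that is applied on return instead of a pushed stack frame.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct Cont {            /* a continuation closure on the heap   */
    long acc;                    /* captured environment                 */
    struct Cont *next;
} Cont;

/* Apply the chain of continuations to a value, freeing each one.        */
static long apply(Cont *k, long v) {
    while (k) {
        v *= k->acc;
        Cont *done = k;
        k = k->next;
        free(done);              /* a GC'd runtime would reclaim lazily  */
    }
    return v;
}

/* Continuation-passing factorial: each pending multiplication becomes a
 * heap-allocated closure instead of a stack frame.                      */
static long fact_cps(long n) {
    Cont *k = NULL;
    while (n > 1) {
        Cont *c = malloc(sizeof *c);
        c->acc  = n--;
        c->next = k;
        k = c;
    }
    return apply(k, 1);
}

/* Plain call-stack version for comparison.                              */
static long fact_stack(long n) { return n <= 1 ? 1 : n * fact_stack(n - 1); }

int main(void) {
    printf("%ld %ld\n", fact_cps(10), fact_stack(10));  /* 3628800 3628800 */
    return 0;
}
```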
  3. We present ASM Visualizer, a tool that is designed to help students learn assembly programming, aiding in their understanding of how assembly instructions are executed and the relationship between assembly and equivalent high-level language code. Our tool allows a user to step both forward and backward through the execution of an assembly program, one instruction at a time, seeing how instructions use and modify values in stack memory and CPU registers. ASM Visualizer presents three user-interface modes, supporting different stages of learning assembly programming. Beginners can step through basic arithmetic instructions, whereas more advanced learners can trace through function call/return sequences, stack frame manipulation, or entire assembly programs. We present our experiences using ASM Visualizer in introductory-level courses at our two institutions, and we discuss other ways in which our tool could be used by educators in both introductory and advanced CS courses. Results from a preliminary assessment of students using our tool show that students gain confidence in their understanding of different aspects of assembly programming. We feel that the visual interface to assembly code execution that ASM Visualizer provides is key to helping students understand assembly.
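One way a visualizer can support both forward and backward stepping is to snapshot machine state (registers, stack, program counter) before each instruction. The C sketch below is a hypothetical illustration of that idea with invented names; it is not ASM Visualizer's actual implementation.

```c
#include <stdio.h>

typedef struct { long regs[4]; long stack[8]; int pc; } Machine;

static Machine history[64];      /* snapshots enable backward stepping   */
static int top = 0;

static void step_forward(Machine *m) {
    history[top++] = *m;         /* save state before executing          */
    /* toy "instruction": push regs[0] onto the stack, then increment it */
    m->stack[m->pc] = m->regs[0];
    m->regs[0] += 1;
    m->pc += 1;
}

static void step_backward(Machine *m) {
    if (top > 0) *m = history[--top];   /* restore previous snapshot     */
}

int main(void) {
    Machine m = { .regs = {42, 0, 0, 0} };
    step_forward(&m);
    step_forward(&m);
    step_backward(&m);           /* undo the second instruction          */
    printf("pc=%d reg0=%ld stack[0]=%ld\n", m.pc, m.regs[0], m.stack[0]);
    /* prints pc=1 reg0=43 stack[0]=42                                   */
    return 0;
}
```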
  4. A key ingredient contributing to the success of CompCert, the state-of-the-art verified compiler for C, is its block-based memory model, which is used uniformly for all of its languages and their verified compilation. However, CompCert's memory model lacks an explicit notion of stack. Its target assembly language represents the runtime stack as an unbounded list of memory blocks, making further compilation of CompCert assembly into more realistic machine code difficult, since it is not possible to merge these blocks into a finite and contiguous stack. Furthermore, various notions of verified compositional compilation rely on some kind of mechanism for protecting private stack data and enabling modification to the public stack-allocated data, which is lacking in the original CompCert. These problems have been investigated but not fully addressed before, in the sense that some advanced optimization passes that significantly change the ways stack blocks are (de-)allocated, such as tailcall recognition and inlining, are often omitted. We propose a lightweight and complete solution to the above problems. It is based on the enrichment of CompCert's memory model with an abstract stack that keeps track of the history of stack frames to bound the stack consumption and that enforces a uniform stack access policy by assigning fine-grained permissions to stack memory. Using this enriched memory model for all the languages of CompCert, we are able to reprove the correctness of the full compilation chain of CompCert, including all the optimization passes. In the end, we get Stack-Aware CompCert, a complete extension of CompCert that enforces the finiteness of the stack and fine-grained stack permissions. Based on Stack-Aware CompCert, we develop CompCertMC, the first extension of CompCert that compiles into a low-level language with flat memory spaces. Based on CompCertMC, we develop Stack-Aware CompCertX, a complete extension of CompCert that supports a notion of compositional compilation that we call contextual compilation by exploiting the uniform stack access policy provided by the abstract stack.
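The following C sketch is a loose, hypothetical illustration of an abstract stack that bounds the frame history and attaches per-slot permissions, as described above. It is not taken from the CompCert development (which is written in Coq); all names, sizes, and the `may_write` policy are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_DEPTH 32
#define MAX_SLOTS 16

typedef enum { PRIVATE, PUBLIC } Perm;

typedef struct {
    Perm perm[MAX_SLOTS];        /* fine-grained, per-slot permissions   */
    int  size;
} Frame;

typedef struct {
    Frame frames[MAX_DEPTH];     /* bounded frame history => finite stack */
    int   depth;
} AbstractStack;

static bool push_frame(AbstractStack *s, Frame f) {
    if (s->depth >= MAX_DEPTH) return false;   /* enforce finiteness     */
    s->frames[s->depth++] = f;
    return true;
}

static void pop_frame(AbstractStack *s) { if (s->depth > 0) s->depth--; }

/* Uniform access policy: code outside the owning frame may only write
 * to PUBLIC slots; PRIVATE slots stay protected.                        */
static bool may_write(const AbstractStack *s, int frame, int slot, bool is_owner) {
    if (frame >= s->depth || slot >= s->frames[frame].size) return false;
    return is_owner || s->frames[frame].perm[slot] == PUBLIC;
}

int main(void) {
    AbstractStack s = { .depth = 0 };
    Frame f = { .perm = { PRIVATE, PUBLIC }, .size = 2 };
    push_frame(&s, f);
    printf("%d %d\n", may_write(&s, 0, 0, false),   /* 0: private slot   */
                      may_write(&s, 0, 1, false));  /* 1: public slot    */
    pop_frame(&s);
    return 0;
}
```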
  5. Scratchpad memories (SPMs) provide a time-predictable alternative to caches but require explicit management in the code. When the call stack is stored in the SPM, stack frames need to be evicted to and restored from main memory to avoid stack overflow. We propose a technique to find optimal locations in the code to perform stack management operations such that the worst-case execution time (WCET) is minimized.
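A toy C sketch of the two stack-management operations such a scheme inserts: evicting a frame from the SPM to main memory and restoring it later. The `ensure_space`/`restore_frame` names, the fixed frame size, and the oldest-first eviction policy are assumptions for illustration; they do not represent the proposed WCET-minimizing placement analysis, only the operations it places.

```c
#include <stdio.h>
#include <string.h>

#define SPM_SIZE 256             /* small, time-predictable scratchpad   */
#define FRAME    64              /* fixed frame size for the sketch      */

static char spm[SPM_SIZE];       /* scratchpad memory                    */
static char main_mem[4096];      /* slower main memory                   */
static int  spm_used = 0, evicted = 0;

/* Before a call: if the callee's frame would overflow the SPM, evict the
 * oldest resident frame to main memory, then allocate the new frame.    */
static void ensure_space(void) {
    if (spm_used + FRAME > SPM_SIZE) {
        memcpy(&main_mem[evicted * FRAME], &spm[0], FRAME);
        memmove(&spm[0], &spm[FRAME], spm_used - FRAME);
        spm_used -= FRAME;
        evicted++;
    }
    spm_used += FRAME;
}

/* After a return: free the returning frame and, if possible, bring the
 * most recently evicted frame back into the freed space.                */
static void restore_frame(void) {
    spm_used -= FRAME;
    if (evicted > 0 && spm_used + FRAME <= SPM_SIZE) {
        memcpy(&spm[spm_used], &main_mem[--evicted * FRAME], FRAME);
        spm_used += FRAME;
    }
}

int main(void) {
    for (int i = 0; i < 6; i++) ensure_space();   /* 6 nested "calls"     */
    printf("resident=%d evicted=%d\n", spm_used / FRAME, evicted); /* 4 2 */
    for (int i = 0; i < 6; i++) restore_frame();
    printf("resident=%d evicted=%d\n", spm_used / FRAME, evicted); /* 0 0 */
    return 0;
}
```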