
Title: From folklore to fact: comparing implementations of stacks and continuations
The efficient implementation of function calls and non-local control transfers is a critical part of modern language implementation and underpins everything from recursion, higher-order functions, concurrency, and coroutines to task-based parallelism. In a compiler, these features can be supported by a variety of mechanisms, including call stacks, segmented stacks, and heap-allocated continuation closures. An implementor of a high-level language with advanced control features might ask, "What is the best choice for my implementation?" Unfortunately, the current literature does not provide much guidance, since previous studies suffer from various methodological flaws and are outdated for modern hardware. In the absence of recent, well-normalized measurements and a holistic overview of their implementation specifics, the path of least resistance when choosing a strategy is to trust folklore, but the folklore is also suspect. This paper attempts to remedy this situation by providing an "apples-to-apples" comparison of six different approaches to implementing call stacks and continuations. This comparison uses the same source language, compiler pipeline, LLVM backend, and runtime system, with the only differences being those required by the differences in implementation strategy. We compare the implementation challenges of the different approaches, their sequential performance, and their suitability to support advanced control mechanisms, including heavily threaded code. In addition to the comparison of implementation strategies, the paper's contributions also include a number of useful implementation techniques that we discovered along the way.
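
To make the contrast concrete, the following is a minimal sketch, in Python rather than the paper's actual source language, compiler, or runtime, of two of the strategies named above: ordinary stack frames created by direct calls versus heap-allocated continuation closures unwound by a trampoline. All names below are illustrative and not taken from the paper's implementation.

    # A minimal sketch (not the paper's code) of ordinary stack frames vs.
    # heap-allocated continuation closures driven by a trampoline.

    class Done:
        """Final result of a trampolined computation."""
        def __init__(self, value):
            self.value = value

    class Call:
        """A suspended step: a zero-argument closure to run next."""
        def __init__(self, thunk):
            self.thunk = thunk

    def trampoline(step):
        """Run a chain of Call steps without growing the native call stack."""
        while isinstance(step, Call):
            step = step.thunk()
        return step.value

    def sum_direct(n):
        """Direct style: each recursive call occupies a native (contiguous)
        stack frame, so a large n overflows the call stack."""
        if n == 0:
            return 0
        return n + sum_direct(n - 1)

    def sum_cps(n):
        """Continuation-passing style: the rest of the computation is the
        continuation k, a closure allocated on the heap, not a stack frame."""
        def go(i, k):
            if i == 0:
                return Call(lambda: k(0))
            # Capture i and k in a fresh heap-allocated continuation closure.
            return Call(lambda: go(i - 1, lambda r: Call(lambda: k(i + r))))
        return trampoline(go(n, lambda r: Done(r)))

    if __name__ == "__main__":
        print(sum_cps(100_000))        # 5000050000, with bounded native stack use
        # sum_direct(100_000) would raise RecursionError in CPython

The point of the sketch is only the representation difference: in the direct version the pending additions live in native stack frames, while in the CPS version they live in closure objects on the heap, which is what makes continuations first-class and cheap to capture.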
Authors:
Award ID(s):
1718540
Publication Date:
NSF-PAR ID:
10201136
Journal Name:
ACM SIGPLAN International Conference on Programming Language Design and Implementation
Page Range or eLocation-ID:
75 to 90
Sponsoring Org:
National Science Foundation
More Like this
  1. In today's mobile-first, cloud-enabled world, where simulation-enabled training is designed for use anywhere and from multiple different types of devices, new paradigms are needed to control access to sensitive data. Large, distributed data sets sourced from a wide variety of sensors require advanced approaches to authorization and access control (AC). Motivated by large-scale, publicized data breaches and data privacy laws, data protection policies and fine-grained AC mechanisms are an imperative in data-intensive simulation systems. Although the public may suffer security-incident fatigue, there are significant impacts to corporations and government organizations in the form of settlement fees and senior-executive dismissal. This paper presents an analysis of the challenges of controlling access to big data sets. Implementation guidelines are provided based upon new attribute-based access control (ABAC) standards. Best practices start with AC for the security of large data sets processed by models and simulations (M&S). The currently widely supported eXtensible Access Control Markup Language (XACML) is the predominant framework for big data ABAC. The more recently developed Next Generation Access Control (NGAC) standard addresses additional areas in securing distributed, multi-owner big data sets. We present a comparison and evaluation of standards and technologies for different simulation data protection requirements. A concrete example is included to illustrate the differences. The example scenario is based upon synthetically generated, very sensitive health care data combined with less sensitive data. This model data set is accessed by representative groups with a range of trust, from highly trusted roles to general users. The AC security challenges and approaches to mitigate risk are discussed.
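
As a rough illustration of the attribute-based decisions described in item 1, the sketch below evaluates a small deny-overrides policy over subject and resource attributes. It is plain Python rather than XACML or NGAC, and the attribute names, clearance levels, and rules are hypothetical, not drawn from the paper's scenario.

    # A toy attribute-based access control (ABAC) decision, deny-overrides style.
    from dataclasses import dataclass

    @dataclass
    class Request:
        subject_role: str           # e.g. "clinician", "analyst", "general"
        subject_clearance: int      # higher = more trusted (hypothetical scale)
        resource_sensitivity: int   # e.g. 3 = sensitive health data, 1 = de-identified
        action: str                 # "read", "write", ...

    POLICY = [
        # (description, predicate over the request, effect)
        ("trusted roles may read data at or below their clearance",
         lambda r: r.action == "read" and r.subject_clearance >= r.resource_sensitivity,
         "Permit"),
        ("writes to highly sensitive data require top clearance",
         lambda r: r.action == "write" and r.resource_sensitivity >= 3 and r.subject_clearance < 4,
         "Deny"),
    ]

    def decide(request: Request) -> str:
        """Deny-overrides combining: any matching Deny wins, then Permit, else default-deny."""
        effects = [effect for _, predicate, effect in POLICY if predicate(request)]
        if "Deny" in effects:
            return "Deny"
        if "Permit" in effects:
            return "Permit"
        return "Deny"

    if __name__ == "__main__":
        print(decide(Request("clinician", 4, 3, "read")))   # Permit
        print(decide(Request("general", 1, 3, "read")))     # Deny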
  2. As customized accelerator design has become increasingly popular to keep up with the demand for high-performance computing, it poses challenges for modern simulator design to adapt to such a large variety of accelerators. Existing simulators tend toward two extremes: low-level, general approaches, such as RTL simulation, that can model any hardware but require substantial effort and long execution times; and higher-level, application-specific models that can be much faster and easier to use but require one-off engineering effort. This work proposes a compiler-driven simulation workflow that can model configurable hardware accelerators. The key idea is to separate structure representation from simulation by developing an intermediate language that can flexibly represent a wide variety of hardware constructs. We design the Event Queue (EQueue) dialect of MLIR, a dialect that can model arbitrary hardware accelerators with explicit data movement and distributed event-based control; we also implement a generic simulation engine to model EQueue programs with hybrid MLIR dialects representing different abstraction levels. We demonstrate two case studies of EQueue-implemented accelerators: a systolic array for convolution and SIMD processors in a modern FPGA. In the former we show EQueue simulation is as accurate as a state-of-the-art simulator, while offering higher extensibility and lower iteration cost via compiler passes. In the latter we demonstrate that our simulation flow can efficiently guide designers in improving their designs using visualizable simulation outputs.
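
The "explicit data movement and distributed event-based control" in item 2 can be pictured with a toy discrete-event engine like the one below. This is not the EQueue MLIR dialect or its simulation engine, only a generic sketch of the event-queue simulation style it builds on; the DMA and processing-element latencies are made up.

    # A toy discrete-event simulation engine: units post timestamped events,
    # and a scheduler advances simulated time event by event.
    import heapq
    from typing import Callable

    class EventQueueSim:
        def __init__(self):
            self.now = 0
            self._queue = []   # (time, seq, callback); seq breaks ties
            self._seq = 0

        def post(self, delay: int, callback: Callable[[], None]) -> None:
            """Schedule `callback` to run `delay` time units from now."""
            heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
            self._seq += 1

        def run(self) -> None:
            while self._queue:
                self.now, _, callback = heapq.heappop(self._queue)
                callback()

    # Example: a DMA-like unit moves a tile, then a processing element consumes it.
    sim = EventQueueSim()
    buffer = {}

    def dma_done():
        buffer["tile"] = list(range(4))
        print(f"t={sim.now}: DMA wrote tile")
        sim.post(5, compute_done)       # hypothetical compute latency

    def compute_done():
        print(f"t={sim.now}: PE consumed {buffer['tile']}")

    sim.post(10, dma_done)              # hypothetical DMA transfer latency
    sim.run()                           # prints events at t=10 and t=15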
  3. Conventional lithium-ion batteries are unable to meet the increasing demands for high-energy storage systems because of their limited theoretical capacity [1]. In recent years, intensive attention has been paid to enhancing battery energy storage capability to satisfy the increasing energy demand in modern society and reduce the average cost of energy capacity. Among the candidates for next-generation high-energy storage systems, the lithium-sulfur battery is especially attractive because of its high theoretical specific energy (around 2600 W h kg-1) and potential cost reduction. In addition, sulfur is a cost-effective and environmentally friendly material due to its abundance and low toxicity [2]. Despite all of these advantages, the practical application of lithium-sulfur batteries has to date been hindered by a series of obstacles, including low active material loading, poor cycle life, and sluggish sulfur conversion kinetics [3]. Achieving a high-mass-loading cathode with the traditional 2D planar thick electrode remains challenging. The high distortion of traditional planar thick electrodes for ion/electron transfer leads to limited utilization of active materials and high resistance, which eventually results in restricted energy density and accelerated electrode failure [4]. Furthermore, limited access of the electrolyte to pores in the cathode reduces the utilization ratio of active materials. Catalysts such as MnO2 and Co dopants have been employed to accelerate the sulfur conversion reaction during the charge and discharge process [5]. However, catalysts based on transition metals suffer from poor electronic conductivity, and other catalysts such as transition metal dopants are also limited by increased process complexity. In addition, the severe shuttle effect in Li-S batteries can lead to rapid failure of the battery. Constructing a protection layer on the separator to limit the transmission of soluble polysulfides is considered an effective way to eliminate the shuttle phenomenon. However, soluble sulfides can still largely dissolve around the cathode side, causing sluggish reaction conditions for sulfur conversion [5]. To mitigate the issues above, we demonstrate herein a novel sulfur electrode design strategy enabled by additive manufacturing and oxidative chemical vapor deposition (oCVD). Specifically, the electrode is strategically designed as a hierarchical hollow structure via a stereolithography technique to increase sulfur usage. The active material concentration loaded into the battery cathode is controlled precisely during 3D printing by adjusting the number of printed layers. Owing to its freedom in geometry and structure, the suggested design is expected to considerably improve Li-ion and electron transport rates, and hence the battery power density. The printed cathode is sintered at 700 °C in an N2 atmosphere to achieve carbonization, during which intrinsic carbon defects (e.g., pentagon carbon) are generated in situ on the cathode as catalytic defect sites; these intrinsic carbon defects also provide adequate electronic conductivity. The sintered 3D cathode is then transferred to the oCVD chamber for deposition of a thin PEDOT layer as a protection layer to restrict dissolution of sulfur compounds from the cathode. Density functional theory calculations reveal the difference in electronic states between the structures with and without defects; the structure with defects exhibits more favorable kinetics for sulfur conversion.
To further identify the favorable reaction dynamics, in-situ XRD is used to characterize the transformation between soluble and insoluble polysulfides, which is the main barrier in the charge and discharge process of Li-S batteries. The results show that the oCVD-coated 3D-printed sulfur cathode exhibits much faster kinetics for sulfur conversion, which benefits from the highly tailored hierarchical hollow structure and the defect engineering on the cathode. Further, the oCVD-coated 3D-printed sulfur cathode also demonstrates higher stability during long cycling, enabled by the oCVD PEDOT protection layer, which is verified by an adsorption-energy calculation of polysulfides on PEDOT. Such modeling and analysis help to elucidate the fundamental mechanisms that govern cathode performance and degradation in Li-S batteries. The current study also provides design strategies for the sulfur cathode as well as selection approaches for novel battery systems.
References:
1. Bhargav, A. (2020). Lithium-Sulfur Batteries: Attaining the Critical Metrics. Joule 4, 285-291.
2. Chung, S.-H. (2018). Progress on the Critical Parameters for Lithium–Sulfur Batteries to be Practically Viable. Advanced Functional Materials 28, 1801188.
3. Peng, H.-J. (2017). Review on High-Loading and High-Energy Lithium–Sulfur Batteries. Advanced Energy Materials 7, 1700260.
4. Chu, T. (2021). 3D printing-enabled advanced electrode architecture design. Carbon Energy 3, 424-439.
5. Shi, Z. (2021). Defect Engineering for Expediting Li–S Chemistry: Strategies, Mechanisms, and Perspectives. Advanced Energy Materials 11.
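
For reference, the "around 2600 W h kg-1" theoretical specific energy quoted in item 3 can be checked with a back-of-the-envelope calculation, assuming the overall reaction 2Li + S -> Li2S (n = 2 electrons), the Faraday constant F of about 26.8 Ah mol-1, molar masses of 6.94 g mol-1 (Li) and 32.07 g mol-1 (S), and an average cell voltage of roughly 2.2 V; these are standard textbook values, not figures taken from the cited work.

    \begin{align*}
    q_{\mathrm{th}} &= \frac{nF}{2M_{\mathrm{Li}} + M_{\mathrm{S}}}
      = \frac{2 \times 26.8\ \mathrm{Ah\,mol^{-1}}}{(2 \times 6.94 + 32.07)\ \mathrm{g\,mol^{-1}}}
      \approx 1.17\ \mathrm{Ah\,g^{-1}}, \\
    E_{\mathrm{th}} &\approx q_{\mathrm{th}} \times \bar{U}
      \approx 1.17\ \mathrm{Ah\,g^{-1}} \times 2.2\ \mathrm{V}
      \approx 2.6\ \mathrm{kWh\,kg^{-1}} \approx 2600\ \mathrm{Wh\,kg^{-1}}.
    \end{align*}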
  4. Abstract

    Quantum computing is a rapidly growing field with the potential to change how we solve previously intractable problems. Emerging hardware is approaching a complexity that requires increasingly sophisticated programming and control. Scaffold is an older quantum programming language that was originally designed for resource estimation on far-future, large quantum machines, and ScaffCC is the corresponding LLVM-based compiler. For the first time, we provide a full and complete overview of the language itself, the compiler, and its pass structure. While previous works (Abhari et al 2015 Parallel Comput. 45 2-17; Abhari et al 2012 Scaffold: quantum programming language, https://cs.princeton.edu/research/techreps/TR-934-12) have piecemeal descriptions of different portions of this toolchain, we provide a fuller and more complete description in this paper. We also introduce updates to ScaffCC, including conditional measurement and multidimensional qubit arrays designed to keep in step with modern quantum assembly languages, as well as an alternate toolchain targeted at maintaining correctness and low resource counts for noisy intermediate-scale quantum (NISQ) machines, and compatibility with current versions of LLVM and Clang. Our goal is to provide the research community with a functional LLVM framework for quantum program analysis, optimization, and generation of executable code.
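
The "conditional measurement" feature mentioned in item 4 refers to letting a mid-circuit measurement outcome control later gates. The sketch below illustrates that pattern with a tiny NumPy state-vector simulation; it is not Scaffold or ScaffCC syntax, and the two-qubit circuit is an arbitrary example.

    # Generic illustration of conditional measurement: measure one qubit
    # mid-circuit and use the classical outcome to control a later gate.
    import numpy as np

    I2 = np.eye(2)
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    X = np.array([[0.0, 1.0], [1.0, 0.0]])

    def apply_gate(state, gate, qubit, n=2):
        """Apply a one-qubit gate to `qubit` of an n-qubit state vector
        (qubit 0 is the most significant bit of the basis-state index)."""
        ops = [gate if q == qubit else I2 for q in range(n)]
        full = ops[0]
        for op in ops[1:]:
            full = np.kron(full, op)
        return full @ state

    def measure(state, qubit, n=2):
        """Projectively measure `qubit`; return (classical outcome, collapsed state)."""
        idx = np.arange(2 ** n)
        bit = (idx >> (n - 1 - qubit)) & 1
        p1 = float(np.sum(np.abs(state[bit == 1]) ** 2))
        outcome = int(np.random.random() < p1)
        collapsed = np.where(bit == outcome, state, 0.0)
        return outcome, collapsed / np.linalg.norm(collapsed)

    # Start in |00>, put qubit 0 into superposition, measure it, and flip
    # qubit 1 only if the measured outcome is 1 (the "conditional" part).
    state = np.zeros(4)
    state[0] = 1.0
    state = apply_gate(state, H, qubit=0)
    outcome, state = measure(state, qubit=0)
    if outcome == 1:
        state = apply_gate(state, X, qubit=1)
    print("measured", outcome, "-> final state", np.round(state, 3))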

  5. Code optimization is an intricate task that is getting more complex as computing systems evolve. Managing the program optimization process, including the implementation and evaluation of code variants, is tedious and inefficient, and errors are likely to be introduced in the process. Moreover, because each platform typically requires a different sequence of transformations to fully harness its computing power, the complexity of the optimization process grows as new platforms are adopted. To address these issues, systems and frameworks have been proposed to automate the code optimization process. They have not, however, been widely adopted and are primarily used by experts with deep knowledge about the underlying architecture and compiler intricacies. This article describes the requirements that we believe are necessary for making automatic performance tuning more broadly used, especially in complex, long-lived high-performance computing applications. Besides discussing limitations of current systems and strategies to overcome them, we describe the design of a system that is able to semi-automatically generate efficient platform-specific code. In the proposed system, the code optimization is programmer-guided and kept separate from the application code, in an external file, in what we call optimization programming. The language used to program the optimization process is able to represent complex collections of transformations and, as a result, to generate efficient platform-specific code. A database manages different optimized versions of code regions, providing a pragmatic approach to performance portability, and the framework itself has separate components, allowing the optimized code to be used on systems without installing all of the modules required for code generation. We present experiments on two different platforms to illustrate the generation of efficient platform-specific code that performs comparably to hand-optimized, vendor-provided code.
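
Item 5's idea of "optimization programming", keeping transformation recipes outside the application code and selecting them per platform, might look roughly like the sketch below. The recipe format, region names, and transformation strings are hypothetical; the paper's actual notation, database, and tooling are not reproduced here.

    # A minimal illustration of per-platform transformation recipes kept
    # separate from the application code.
    RECIPES = {
        # code region -> target platform -> ordered list of transformations
        "matmul_kernel": {
            "cpu_avx2": ["tile(i=64,j=64,k=64)", "unroll(k=8)", "vectorize(j)"],
            "gpu_cuda": ["tile(i=32,j=32)", "map_threads(i,j)", "use_shared_memory"],
        },
    }

    def select_recipe(region: str, platform: str):
        """Look up the platform-specific transformation sequence for a region;
        a real system would also consult a database of tuned code variants."""
        try:
            return RECIPES[region][platform]
        except KeyError:
            return []   # fall back to the unoptimized baseline

    if __name__ == "__main__":
        for step in select_recipe("matmul_kernel", "cpu_avx2"):
            print("apply:", step)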