NSF PAR Search | NSF Public Access Repository

Unifying Static and Dynamic Intermediate Languages for Accelerator Generators

https://doi.org/10.1145/3689790

Kim, Caleb; Li, Pai; Mohan, Anshuman; Butt, Andrew; Sampson, Adrian; Nigam, Rachit (October 2024, Proceedings of the ACM on Programming Languages)

Compilers for accelerator design languages (ADLs) translate high-level languages into application-specific hardware. ADL compilers rely on a hardware control interface to compose hardware units. There are two choices: static control, which relies on cycle-level timing; or dynamic control, which uses explicit signalling to avoid depending on timing details. Static control is efficient but brittle; dynamic control incurs hardware costs to support compositional reasoning. Piezo is an ADL compiler that unifies static and dynamic control in a single intermediate language (IL). Its key insight is that the IL’s static fragment is a refinement of its dynamic fragment: static code admits a subset of the run-time behaviors of the dynamic equivalent. Piezo can optimize code by combining facts from static and dynamic submodules, and it opportunistically converts code from dynamic to static control styles. We implement Piezo as an extension to an existing dynamic ADL compiler, Calyx. We use Piezo to implement a frontend for an existing ADL, a systolic array generator, and a packet-scheduling hardware generator to demonstrate its optimizations and the static–dynamic interactions it enables.
Full Text Available
Modular Hardware Design with Timeline Types

https://doi.org/10.1145/3591234

Nigam, Rachit; Azevedo_de_Amorim, Pedro_Henrique; Sampson, Adrian (June 2023, Proceedings of the ACM on Programming Languages)

Modular design is a key challenge for enabling large-scale reuse of hardware modules. Unlike software, however, hardware designs correspond to physical circuits and inherit constraints from them. Timing constraints (which cycle a signal arrives, when an input is read) and structural constraints (how often a multiplier accepts new inputs) are fundamental to hardware interfaces. Existing hardware design languages do not provide a way to encode these constraints; a user must read documentation, build scripts, or in the worst case, a module’s implementation to understand how to use it. We present Filament, a language for modular hardware design that supports the specification and enforcement of timing and structural constraints for statically scheduled pipelines. Filament uses timeline types, which describe the intervals of clock-cycle time when a given signal is available or required. Filament enables safe composition of hardware modules, ensures that the resulting designs are correctly pipelined, and predictably lowers them to efficient hardware.
Stepwise Debugging for Hardware Accelerators

https://doi.org/10.1145/3575693.3575717

Berlstein, Griffin; Nigam, Rachit; Gyurgyik, Christophe; Sampson, Adrian (January 2023, 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)

Full Text Available
Performance Left on the Table: An Evaluation of Compiler Autovectorization for RISC-V

https://doi.org/10.1109/MM.2022.3184867

Adit, Neil; Sampson, Adrian (September 2022, IEEE Micro)

Full Text Available
Compiler-Driven Simulation of Reconfigurable Hardware Accelerators

https://doi.org/10.1109/HPCA53966.2022.00052

Li, Zhijing; Ye, Yuwei; Neuendorffer, Stephen; Sampson, Adrian (April 2022, 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA))

As customized accelerator design has become increasingly popular to keep up with the demand for high-performance computing, it poses challenges for modern simulator design to adapt to such a large variety of accelerators. Existing simulators tend toward two extremes: low-level and general approaches, such as RTL simulation, that can model any hardware but require substantial effort and long execution times; and higher-level application-specific models that can be much faster and easier to use but require one-off engineering effort. This work proposes a compiler-driven simulation workflow that can model configurable hardware accelerators. The key idea is to separate structure representation from simulation by developing an intermediate language that can flexibly represent a wide variety of hardware constructs. We design the Event Queue (EQueue) dialect of MLIR, a dialect that can model arbitrary hardware accelerators with explicit data movement and distributed event-based control; we also implement a generic simulation engine to model EQueue programs with hybrid MLIR dialects representing different abstraction levels. We demonstrate two case studies of EQueue-implemented accelerators: the systolic array of convolution and SIMD processors in a modern FPGA. In the former we show EQueue simulation is as accurate as a state-of-the-art simulator, while offering higher extensibility and lower iteration cost via compiler passes. In the latter we demonstrate our simulation flow can guide designers to efficiently improve their designs using visualizable simulation outputs.
Full Text Available
Software-Defined Vector Processing on Manycore Fabrics

https://doi.org/10.1145/3466752.3480099

Bedoukian, Philip; Adit, Neil; Peguero, Edwin; Sampson, Adrian (October 2021, MICRO '21: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture)

Full Text Available
A compiler infrastructure for accelerator generators

https://doi.org/10.1145/3445814.3446712

Nigam, Rachit; Thomas, Samuel; Li, Zhijing; Sampson, Adrian (April 2021, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2021))
We present Calyx, a new intermediate language (IL) for compiling high-level programs into hardware designs. Calyx combines a hardware-like structural language with a software-like control flow representation with loops and conditionals. This split representation enables a new class of hardware-focused optimizations that require both structural and control flow information, which are crucial for high-level programming models for hardware design. The Calyx compiler lowers control flow constructs using finite-state machines and generates synthesizable hardware descriptions. We have implemented Calyx in an optimizing compiler that translates high-level programs to hardware. We demonstrate Calyx using two DSL-to-RTL compilers, a systolic array generator and one for a recent imperative accelerator language, and compare them to equivalent designs generated using high-level synthesis (HLS). The systolic arrays are 4.6× faster and 1.11× larger on average than HLS implementations, and the HLS-like imperative language compiler is within a few factors of a highly optimized commercial HLS toolchain. We also describe three optimizations implemented in the Calyx compiler.
Full Text Available
Vectorization for digital signal processors via equality saturation

https://doi.org/10.1145/3445814.3446707

VanHattum, Alexa; Nigam, Rachit; Lee, Vincent T.; Bornholt, James; Sampson, Adrian (April 2021, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2021))
Applications targeting digital signal processors (DSPs) benefit from fast implementations of small linear algebra kernels. While existing auto-vectorizing compilers are effective at extracting performance from large kernels, they struggle to invent the complex data movements necessary to optimize small kernels. To get the best performance, DSP engineers must hand-write and tune specialized small kernels for a wide spectrum of applications and architectures. We present Diospyros, a search-based compiler that automatically finds efficient vectorizations and data layouts for small linear algebra kernels. Diospyros combines symbolic evaluation and equality saturation to vectorize computations with irregular structure. We show that a collection of Diospyros-compiled kernels outperform implementations from existing DSP libraries by 3.1× on average, that Diospyros can generate kernels that are competitive with expert-tuned code, and that optimizing these small kernels offers end-to-end speedup for a DSP application.
Full Text Available
Geometry types for graphics programming

https://doi.org/10.1145/3428241

Geisler, Dietrich; Yoon, Irene; Kabra, Aditi; He, Horace; Sanders, Yinnon; Sampson, Adrian (November 2020, Proceedings of the ACM on Programming Languages)
Full Text Available
Predictable Accelerator Design with Time-Sensitive Affine Types

Nigam, R.; Atapattu, S.; Thomas, S.; Bauer, T.; Koti, A.; Li, Z.; Ye, Y.; Sampson, A.; Zhang, Z. (June 2020, Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation)

Field-programmable gate arrays (FPGAs) provide an opportunity to co-design applications with hardware accelerators, yet they remain difficult to program. High-level synthesis (HLS) tools promise to raise the level of abstraction by compiling C or C++ to accelerator designs. Repurposing legacy software languages, however, requires complex heuristics to map imperative code onto hardware structures. We find that the black-box heuristics in HLS can be unpredictable: changing parameters in the program that should improve performance can counterintuitively yield slower and larger designs. This paper proposes a type system that restricts HLS to programs that can predictably compile to hardware accelerators. The key idea is to model consumable hardware resources with a time-sensitive affine type system that prevents simultaneous uses of the same hardware structure. We implement the type system in Dahlia, a language that compiles to HLS C++, and show that it can reduce the size of HLS parameter spaces while accepting Pareto-optimal designs.
Full Text Available
