NSF PAR Search | NSF Public Access Repository

FireAxe: Partitioned FPGA-Accelerated Simulation of Large-Scale RTL Designs

Whangbo, Joonho; Lim, Edwin; Zhang, Chengyi Lux; Anderson, Kevin; Gonzalez, Abraham; Gupta, Raghav; Krishnakumar, Nivedha; Karandikar, Sagar; Nikolić, Borivoje; Shao, Yakun Sophia; et al (July 2024, ACM/IEEE International Symposium on Computer Architecture (ISCA 2024))

CDPU: Co-designing Compression and Decompression Processing Units for Hyperscale Systems

https://doi.org/10.1145/3579371.3589074

Karandikar, Sagar; Udipi, Aniruddha N.; Choi, Junsun; Whangbo, Joonho; Zhao, Jerry; Kanev, Svilen; Lim, Edwin; Alakuijala, Jyrki; Madduri, Vrishab; Shao, Yakun Sophia; et al (June 2023, ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture)

Profiling Hyperscale Big Data Processing

https://doi.org/10.1145/3579371.3589082

Gonzalez, Abraham; Kolli, Aasheesh; Khan, Samira; Liu, Sihang; Dadu, Vidushi; Karandikar, Sagar; Chang, Jichuan; Asanovic, Krste; Ranganathan, Parthasarathy (January 2023, Proceedings of the 50th Annual International Symposium on Computer Architecture)

A Hardware Accelerator for Protocol Buffers

https://doi.org/10.1145/3466752.3480051

Karandikar, Sagar; Leary, Chris; Kennelly, Chris; Zhao, Jerry; Parimi, Dinesh; Nikolic, Borivoje; Asanovic, Krste; Ranganathan, Parthasarathy (October 2021, 54th Annual IEEE/ACM International Symposium on Microarchitecture)

COBRA: A Framework for Evaluating Compositions of Hardware Branch Predictors

https://doi.org/10.1109/ISPASS51385.2021.00053

Zhao, Jerry; Gonzalez, Abraham; Amid, Alon; Karandikar, Sagar; Asanovic, Krste (March 2021, 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS))
We present COBRA, a framework that enables a realistic hardware-guided methodology for evaluating compositions of hardware branch predictors. COBRA provides a common interface for developing RTL implementations of predictor sub-components, as well as a predictor composer that automatically generates hardware predictor pipelines from sub-components based on a high-level topological model of a desired algorithm. We demonstrate how COBRA aids in the design and evaluation of diverse predictor architectures and how our hardware-centric approach captures concerns in predictor characterization that are not exposed in software-based algorithm development. Using COBRA, we generate three superscalar pipelined branch predictors with diverse architectures, synthesize them to run at 1 GHz on a commercial FinFET process, integrate them with the open-source BOOM out-of-order core, and evaluate their end-to-end performance on workloads over trillions of cycles. The COBRA generator system has been open-sourced as part of the SonicBOOM out-of-order core.
FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud

https://doi.org/10.1109/ISCA.2018.00014

Karandikar, Sagar; Mao, Howard; Kim, Donggyu; Biancolin, David; Amid, Alon; Lee, Dayeol; Pemberton, Nathan; Amaro, Emmanuel; Schmidt, Colin; Chopra, Aditya; et al (June 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA))

We present FireSim, an open-source simulation platform that enables cycle-exact microarchitectural simulation of large scale-out clusters by combining FPGA-accelerated simulation of silicon-proven RTL designs with a scalable, distributed network simulation. Unlike prior FPGA-accelerated simulation tools, FireSim runs on Amazon EC2 F1, a public cloud FPGA platform, which greatly improves usability, provides elasticity, and lowers the cost of large-scale FPGA-based experiments. We describe the design and implementation of FireSim and show how it can provide sufficient performance to run modern applications at scale, to enable true hardware-software co-design. As an example, we demonstrate automatically generating and deploying a target cluster of 1,024 3.2 GHz quad-core server nodes, each with 16 GB of DRAM, interconnected by a 200 Gbit/s network with 2 microsecond latency, which simulates at a 3.4 MHz processor clock rate (less than 1,000x slowdown over real-time). In aggregate, this FireSim instantiation simulates 4,096 cores and 16 TB of memory, runs ~14 billion instructions per second, and harnesses 12.8 million dollars' worth of FPGAs, at a total cost of only ~$100 per simulation hour to the user. We present several examples to show how FireSim can be used to explore various research directions in warehouse-scale machine design, including modeling networks with high bandwidth and low latency, integrating arbitrary RTL designs for a variety of commodity and specialized datacenter nodes, and modeling a variety of datacenter organizations, as well as reusing the scale-out FireSim infrastructure to enable fast, massively parallel cycle-exact single-node microarchitectural experimentation.
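
The aggregate figures in this abstract follow from the per-node numbers it quotes. The short Python sketch below reproduces that arithmetic; the constant and function names are hypothetical and chosen only for illustration, and the aggregate cycle rate is a derived quantity that the abstract does not state directly.

```python
# Per-node and clock figures as quoted in the FireSim abstract.
NODES = 1024                 # simulated server nodes
CORES_PER_NODE = 4           # quad-core nodes
DRAM_GB_PER_NODE = 16        # GB of DRAM per node
TARGET_CLOCK_HZ = 3.2e9      # modeled processor clock (3.2 GHz)
SIM_CLOCK_HZ = 3.4e6         # achieved simulation rate (3.4 MHz)


def cluster_totals():
    """Derive the aggregate numbers from the per-node figures above."""
    total_cores = NODES * CORES_PER_NODE
    # Binary conversion (1 TB = 1024 GB), which matches the quoted 16 TB.
    total_dram_tb = NODES * DRAM_GB_PER_NODE / 1024
    # Slowdown relative to real time: target clock over simulated clock.
    slowdown = TARGET_CLOCK_HZ / SIM_CLOCK_HZ
    # Simulated core-cycles per wall-clock second, summed over all cores.
    # This is a derived upper-level figure, not one stated in the abstract.
    aggregate_cycles_per_s = total_cores * SIM_CLOCK_HZ
    return total_cores, total_dram_tb, slowdown, aggregate_cycles_per_s


if __name__ == "__main__":
    cores, dram_tb, slowdown, cycles = cluster_totals()
    print(f"cores: {cores}")
    print(f"DRAM: {dram_tb:.0f} TB")
    print(f"slowdown: {slowdown:.0f}x")
    print(f"aggregate cycles/s: {cycles / 1e9:.1f} billion")
```

The slowdown comes out a little above 940x, consistent with the "less than 1,000x" claim, and the aggregate rate of roughly 13.9 billion simulated core-cycles per second is in line with the quoted ~14 billion instructions per second if each core retires about one instruction per cycle, which is an assumption rather than a figure from the abstract.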