NSF PAR Search | NSF Public Access Repository

Demystifying Map Space Exploration for NPUs

https://doi.org/10.1109/IISWC55918.2022.00031

Kao, Sheng-Chun; Parashar, Angshuman; Tsai, Po-An; Krishna, Tushar (November 2022, Proceedings of the IEEE International Symposium on Workload Characterization)

Full Text Available
A Formalism of DNN Accelerator Flexibility

https://doi.org/10.1145/3530907

Kao, Sheng-Chun; Kwon, Hyoukjun; Pellauer, Michael; Parashar, Angshuman; Krishna, Tushar (June 2022, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

The high efficiency of domain-specific hardware accelerators for machine learning (ML) has come from specialization, with the trade-off of less configurability/flexibility. There is growing interest in developing flexible ML accelerators to make them future-proof to the rapid evolution of Deep Neural Networks (DNNs). However, the notion of accelerator flexibility has always been used in an informal manner, restricting computer architects from conducting systematic apples-to-apples design-space exploration (DSE) across trillions of choices. In this work, we formally define accelerator flexibility and show how it can be integrated for DSE. Specifically, we capture DNN accelerator flexibility across four axes: tiling, ordering, parallelization, and array shape. We categorize existing accelerators into 16 classes based on their axes of flexibility support, and define a precise quantification of the degree of flexibility of an accelerator across each axis. We leverage these to develop a novel flexibility-aware DSE framework. We demonstrate how this can be used to perform first-of-their-kind evaluations, including an isolation study to identify the individual impact of the flexibility axes. We demonstrate that adding flexibility features to a hypothetical DNN accelerator designed in 2014 improves runtime on future (i.e., present-day) DNNs by 11.8x geomean.
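As a rough illustration of the class count in the abstract above: if each of the four axes is treated as either supported or not supported (an assumption about how the 16 classes arise, not something stated on this page), the classes are the 2^4 = 16 on/off combinations. The names `AXES` and `flexibility_classes` are hypothetical and used only for this sketch.

```python
from itertools import product

# The four flexibility axes named in the abstract.
AXES = ("tiling", "ordering", "parallelization", "array shape")

def flexibility_classes():
    """Enumerate every on/off combination of axis support (assumed binary)."""
    return [dict(zip(AXES, bits))
            for bits in product((False, True), repeat=len(AXES))]

classes = flexibility_classes()
print(len(classes))  # 16
```

The first entry is the fully inflexible combination (no axis supported) and the last is the fully flexible one.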
MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores

https://doi.org/10.1109/HPCA53966.2022.00065

Kao, Sheng-Chun; Krishna, Tushar (April 2022, 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA))

Full Text Available
Marvel: A Data-Centric Approach for Mapping Deep Learning Operators on Spatial Accelerators

https://doi.org/10.1145/3485137

Chatarasi, Prasanth; Kwon, Hyoukjun; Parashar, Angshuman; Pellauer, Michael; Krishna, Tushar; Sarkar, Vivek (March 2022, ACM Transactions on Architecture and Code Optimization)

A spatial accelerator’s efficiency depends heavily on both its mapper and cost models to generate optimized mappings for various operators of DNN models. However, existing cost models lack a formal boundary over their input programs (operators) for accurate and tractable cost analysis of the mappings, and this results in adaptability challenges to the cost models for new operators. We consider the recently introduced Maestro Data-Centric (MDC) notation and its analytical cost model to address this challenge because any mapping expressed in the notation is precisely analyzable using the MDC’s cost model. In this article, we characterize the set of input operators and their mappings expressed in the MDC notation by introducing a set of conformability rules. The outcome of these rules is that any loop nest that is perfectly nested with affine tensor subscripts and without conditionals is conformable to the MDC notation. A majority of the primitive operators in deep learning are such loop nests. In addition, our rules enable us to automatically translate a mapping expressed in the loop nest form to MDC notation and use the MDC’s cost model to guide upstream mappers. Our conformability rules over the input operators result in a structured mapping space of the operators, which enables us to introduce a mapper based on our decoupled off-chip/on-chip approach to accelerate mapping space exploration. Our mapper decomposes the original higher-dimensional mapping space of operators into two lower-dimensional off-chip and on-chip subspaces and then optimizes the off-chip subspace followed by the on-chip subspace. We implemented our overall approach in a tool called Marvel, and a benefit of our approach is that it applies to any operator conformable with the MDC notation. We evaluated Marvel over major DNN operators and compared it with past optimizers.
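For a concrete picture of the kind of loop nest the abstract describes, the sketch below writes a 1D convolution as a perfectly nested loop with an affine subscript (`x + r`) and no conditionals. It is an illustrative example chosen here, not one taken from the article, and the function name `conv1d` is hypothetical.

```python
def conv1d(inputs, weights):
    """1D convolution written as a perfectly nested loop nest.

    The only statement inside the nest sits in the innermost loop body,
    every subscript is an affine function of the loop indices, and the
    nest contains no conditionals.
    """
    out_len = len(inputs) - len(weights) + 1
    outputs = [0] * out_len
    for x in range(out_len):            # output position
        for r in range(len(weights)):   # filter tap
            outputs[x] += weights[r] * inputs[x + r]
    return outputs

print(conv1d([1, 2, 3, 4], [1, 1]))  # [3, 5, 7]
```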
Full Text Available
DiGamma: Domain-aware Genetic Algorithm for HW-Mapping Co-optimization for DNN Accelerators

https://doi.org/10.23919/DATE54114.2022.9774568

Kao, Sheng-Chun; Pellauer, Michael; Parashar, Angshuman; Krishna, Tushar (March 2022, 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE))

Full Text Available
Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators

https://doi.org/10.1109/PACT52795.2021.00010

Jeong, Geonhwa; Kestor, Gokcen; Chatarasi, Prasanth; Parashar, Angshuman; Tsai, Po-An; Rajamanickam, Sivasankaran; Gioiosa, Roberto; Krishna, Tushar (September 2021, 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT))

Full Text Available
Heterogeneous Dataflow Accelerators for Multi-DNN Workloads

https://doi.org/10.1109/HPCA51647.2021.00016

Kwon, Hyoukjun; Lai, Liangzhen; Pellauer, Michael; Krishna, Tushar; Chen, Yu-Hsin; Chandra, Vikas (February 2021, 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA))

Full Text Available
Flexion: A Quantitative Metric for Flexibility in DNN Accelerators

https://doi.org/10.1109/LCA.2020.3044607

Kwon, Hyoukjun; Pellauer, Michael; Parashar, Angshuman; Krishna, Tushar (January 2021, IEEE Computer Architecture Letters)

Full Text Available
GAMMA: automating the HW mapping of DNN models on accelerators via genetic algorithm

https://doi.org/10.1145/3400302.3415639

Kao, Sheng-Chun; Krishna, Tushar (November 2020, Proceedings of the 39th International Conference on Computer-Aided Design)

Full Text Available
ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning

https://doi.org/10.1109/MICRO50266.2020.00058

Kao, Sheng-Chun; Jeong, Geonhwa; Krishna, Tushar (October 2020, 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO))

Full Text Available
