UPTPU: Improving Energy Efficiency of a Tensor Processing Unit through Underutilization Based Power-Gating

Pandey, Pramesh; Gundi, Noel Daniel; Chakraborty, Koushik; Roy, Sanghamitra

doi:10.1109/DAC18074.2021.9586224

The AI boom is bringing a plethora of domain-specific architectures for neural network computations. Google’s Tensor Processing Unit (TPU), a Deep Neural Network (DNN) accelerator, has replaced the CPUs/GPUs in its data centers, with a claimed inference rate more than 15X higher. However, the unprecedented growth in DNN workloads, driven by the widespread use of AI services, points to rising energy consumption in TPU-based data centers. In this work, we parametrize the extreme hardware underutilization in the TPU systolic array and propose UPTPU: an intelligent, dataflow-adaptive power-gating paradigm that provides 3.5X to 6.5X energy efficiency to the TPU for different input batch sizes.

Award ID(s): 2106237; 1253024

NSF-PAR ID: 10347896

Author(s) / Creator(s): Pandey, Pramesh; Gundi, Noel Daniel; Chakraborty, Koushik; Roy, Sanghamitra

Date Published: 2021-12-05

Journal Name: 2021 58th ACM/IEEE Design Automation Conference (DAC)

Page Range / eLocation ID: 325 to 330

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/DAC18074.2021.9586224
