Title: STRIVE: Enabling Choke Point Detection and Timing Error Resilience in a Low-Power Tensor Processing Unit
Rapid growth in Deep Neural Network (DNN) workloads has increased the energy footprint of Artificial Intelligence (AI) computing. For optimum energy efficiency, we propose operating DNN hardware in the Low-Power Computing (LPC) region. However, operating at LPC increases delay sensitivity to Process Variation (PV), and delay faults are an intriguing consequence of PV. In this paper, we demonstrate that delay variations substantially lower DNN prediction accuracy. To overcome delay faults, we present STRIVE, a post-fabrication fault detection and reactive error reduction technique. We also introduce a time-borrow correction technique to ensure error-free DNN computation.
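To make the failure mode concrete, here is a minimal, illustrative Python sketch (not the STRIVE implementation) of how rare timing errors in MAC units corrupt high-order bits of a dot product; the fault rate and affected bit position are assumed values chosen only for illustration:

```python
# Illustrative sketch only (not the paper's method): a dot product whose
# accumulator occasionally latches a late-arriving high-order bit. The
# fault_rate and fault_bit values below are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

def mac_with_timing_faults(a, w, fault_rate=1e-3, fault_bit=14):
    """Dot product in which each partial sum may suffer a delay fault."""
    acc = 0
    for x, y in zip(a, w):
        acc += int(x) * int(y)
        if rng.random() < fault_rate:   # a signal misses its timing window
            acc ^= 1 << fault_bit       # a high-order bit latches wrong
    return acc

a = rng.integers(-128, 128, size=256)   # int8-like activations
w = rng.integers(-128, 128, size=256)   # int8-like weights
exact = int(np.dot(a, w))
faulty = mac_with_timing_faults(a, w)
print(f"exact={exact}  faulty={faulty}  error={faulty - exact}")
```

A single flipped high-order bit perturbs the accumulated sum by thousands of counts, which is why even rare delay faults visibly degrade prediction accuracy.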
Award ID(s):
2106237
NSF-PAR ID:
10445571
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings ACM IEEE Design Automation Conference
ISSN:
0738-100X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Sparse deep neural networks (DNNs) have the potential to deliver compelling performance and energy efficiency without significant accuracy loss. However, their benefits can quickly diminish if their training is oblivious to the target hardware. For example, even a small number of critical connections can incur significant overhead if they translate into long-distance communication on the target hardware. Therefore, hardware-aware sparse training is needed to leverage the full potential of sparse DNNs. To this end, we propose a novel and comprehensive communication-aware sparse DNN optimization framework for tile-based in-memory computing (IMC) architectures. The proposed technique, CANNON, first maps the DNN layers onto the tiles of the target architecture. Then, it replaces the fully connected and convolutional layers with communication-aware sparse connections. After that, CANNON optimizes the communication cost with minimal impact on the DNN accuracy. Extensive experimental evaluations with a wide range of DNNs and datasets show up to 3.0× lower communication energy, 3.1× lower communication latency, and 6.8× lower energy-delay product compared to state-of-the-art pruning approaches, with negligible impact on classification accuracy on IMC-based machine learning accelerators.
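    As a rough illustration of the communication-aware idea, the following Python sketch scores each weight by magnitude per mesh-hop cost and prunes the worst; the mesh layout, cost model, and keep ratio are assumptions for illustration, not CANNON's actual algorithm:

```python
# A rough sketch under assumed names and cost model (not CANNON itself):
# weights are scored by |w| divided by the mesh-hop distance their activations
# must travel, so pruning removes long-distance connections first.
import numpy as np

rng = np.random.default_rng(1)

def hop_distance(src, dst):
    """Manhattan hop count between two tiles on a 2D mesh."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

def communication_aware_prune(weights, src_tiles, dst_tiles, keep_ratio=0.3):
    """Keep the weights with the best magnitude-per-hop score."""
    scores = np.empty_like(weights)
    for i in range(weights.shape[0]):          # output neurons
        for j in range(weights.shape[1]):      # input neurons
            hops = 1 + hop_distance(src_tiles[j], dst_tiles[i])
            scores[i, j] = abs(weights[i, j]) / hops
    k = int(weights.size * keep_ratio)
    threshold = np.sort(scores, axis=None)[-k]
    return np.where(scores >= threshold, weights, 0.0)

# 8 input neurons spread over a 4x4 mesh feeding 4 output neurons.
w = rng.normal(size=(4, 8))
src = [(j % 4, j // 4) for j in range(8)]
dst = [(i % 4, 3) for i in range(4)]
sparse_w = communication_aware_prune(w, src, dst)
print(f"kept {np.count_nonzero(sparse_w)} of {w.size} weights")
```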
  2. Deep neural networks (DNNs) are gaining popularity in a wide range of domains, ranging from speech and video recognition to healthcare. With this increased adoption comes the pressing need for securing DNN execution environments on CPUs, GPUs, and ASICs. While there are active research efforts in supporting a trusted execution environment (TEE) on CPUs, the exploration in supporting TEEs on accelerators is limited, with only a few solutions available. A key limitation along this line of work is that these secure DNN accelerators narrowly consider a few specific architectures; the design choices and the associated cost for securing these architectures do not transfer to other diverse architectures. This paper strives to address this limitation by developing a design space exploration tool for supporting TEEs on diverse DNN accelerators. We target secure DNN accelerators equipped with cryptographic engines where the cryptographic operations are closely coupled with the data movement in the accelerators. These operations significantly complicate scheduling for DNN accelerators, as the schedule must account for the extra on-chip computation and off-chip memory accesses introduced by the cryptographic operations, and even for potential interactions across DNN layers. We tackle these challenges in our tool, called SecureLoop, by introducing a scheduling search engine with the following attributes: 1) it considers the cryptographic overhead associated with every off-chip data access, 2) it uses an efficient modular arithmetic technique to compute the optimal authentication block assignment for each individual layer, and 3) it uses a simulated annealing algorithm to perform cross-layer optimizations. Compared to conventional schedulers, our tool finds schedules for secure DNN designs with up to 33.2% speedup and 50.2% improvement in energy-delay product.
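    The cross-layer search can be pictured with a toy simulated-annealing loop over per-layer authentication block sizes; the cost model below (fixed tag size plus over-fetch waste) is a hypothetical stand-in, not SecureLoop's, and only the overall search structure follows the abstract's outline:

```python
# Toy stand-in, not SecureLoop's cost model: simulated annealing over
# per-layer authentication block sizes. Each block carries a fixed-size
# MAC tag; blocks that straddle a tile force redundant off-chip fetches.
import math
import random

random.seed(0)
LAYER_TILE_SIZES = [512, 1024, 768, 2048]    # assumed off-chip tile sizes (words)
TAG_WORDS = 8                                # assumed per-block tag overhead

def cost(block_sizes):
    total = 0.0
    for tile, block in zip(LAYER_TILE_SIZES, block_sizes):
        n_blocks = math.ceil(tile / block)
        total += n_blocks * TAG_WORDS        # authentication tag traffic
        total += n_blocks * block - tile     # data fetched but never used
    return total

state = [256] * len(LAYER_TILE_SIZES)
best, best_cost, temp = list(state), cost(state), 100.0
while temp > 0.1:
    cand = list(state)
    i = random.randrange(len(cand))
    cand[i] = int(max(32, min(2048, cand[i] * random.choice([0.5, 2]))))
    delta = cost(cand) - cost(state)
    if delta < 0 or random.random() < math.exp(-delta / temp):
        state = cand
        if cost(state) < best_cost:
            best, best_cost = list(state), cost(state)
    temp *= 0.95                             # cooling schedule
print(f"block sizes per layer: {best}  cost: {best_cost:.0f} words")
```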
  3. Increasing processing requirements in the Artificial Intelligence (AI) realm have led to the emergence of domain-specific architectures for Deep Neural Network (DNN) applications. The Tensor Processing Unit (TPU), a DNN accelerator by Google, has emerged as a front runner, outclassing its contemporaries, CPUs and GPUs, in performance by 15×–30×. TPUs have been deployed in Google data centers to meet these performance demands. However, a TPU's performance enhancement comes with mammoth power consumption. In pursuit of lower energy utilization, this paper proposes PREDITOR, a low-power TPU operating in the Near-Threshold Computing (NTC) realm. PREDITOR uses mathematical analysis to mitigate undetectable timing errors by boosting the voltage of selective multiplier-and-accumulator units at specific intervals, enhancing the performance of the NTC TPU and thereby ensuring high inference accuracy at low voltage. PREDITOR offers up to 3×–5× improved performance compared to leading-edge error-mitigation schemes, with a minor loss in accuracy.
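    A hedged sketch of the mechanism, not PREDITOR's analysis: model per-MAC timing-error rates that fall exponentially with supply voltage, and periodically boost only the most variation-sensitive units. All constants below are assumed for illustration:

```python
# Assumed constants throughout; this models the mechanism (periodic voltage
# boosts on the most variation-sensitive MACs), not PREDITOR's math.
import numpy as np

rng = np.random.default_rng(2)
N_MACS, STEPS = 64, 1000
V_NOM, V_BOOST = 0.45, 0.60                  # near-threshold vs. boosted volts

def error_prob(v):
    """Assumed exponential drop of timing-error rate with voltage."""
    return 1e-2 * np.exp(-12.0 * (v - V_NOM))

# Per-MAC delay sensitivity from process variation (assumed lognormal).
sensitivity = rng.lognormal(mean=0.0, sigma=0.4, size=N_MACS)
critical = np.argsort(sensitivity)[-8:]      # the 8 slowest "choke" MACs

errors_plain = errors_boosted = 0
for step in range(STEPS):
    v_plain = np.full(N_MACS, V_NOM)
    v_boost = v_plain.copy()
    if step % 4 == 0:                        # boost only at intervals
        v_boost[critical] = V_BOOST
    errors_plain += (rng.random(N_MACS) < error_prob(v_plain) * sensitivity).sum()
    errors_boosted += (rng.random(N_MACS) < error_prob(v_boost) * sensitivity).sum()

print(f"timing errors: {errors_plain} without boost, {errors_boosted} with boost")
```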
  4. Multiply-accumulate (MAC) operations are common in data processing and machine learning but costly in terms of hardware usage. Stochastic Computing (SC) is a promising approach for low-cost hardware design of complex arithmetic operations such as multiplication. Computing with deterministic unary bit-streams (defined as bit-streams with all 1s grouped together at the beginning or end of the stream) has recently been suggested to improve the accuracy of SC. Conventionally, SC designs use multiplexer (MUX) units or OR gates to accumulate data in the stochastic domain. MUX-based addition suffers from scaling of data, and OR-based addition from inaccuracy. This work proposes a novel technique for MAC operation on unary bit-streams that allows exact, non-scaled addition of multiplication results. By introducing a relative delay between the products, we control the correlation between bit-streams and eliminate OR-based addition error. We evaluate the accuracy of the proposed technique compared to state-of-the-art MAC designs. After quantization, the proposed technique demonstrates at least a 37% and up to a 100% decrease in mean absolute error for uniformly distributed random input values compared to traditional OR-based MAC designs. Further, we demonstrate that the proposed technique is practical and evaluate the area, power, and energy of three possible implementations.
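    The delayed-OR accumulation is concrete enough to sketch directly. Assuming the products are already available as unary bit-streams (the multiplier design itself is not reproduced here), delaying each stream by the total number of 1s emitted so far guarantees the streams never overlap, so a plain OR yields the exact, non-scaled sum:

```python
# Minimal sketch of delayed-OR accumulation of unary bit-streams. Stream
# generation below is a stand-in; only the accumulation idea is shown.
L = 64                                       # bit-stream length (assumed)

def unary(ones):
    """Unary bit-stream: `ones` 1s grouped at the beginning of the stream."""
    return [1] * ones + [0] * (L - ones)

def delayed_or_accumulate(products):
    """Exact, non-scaled accumulation via relative delays plus OR."""
    out, delay = [0] * L, 0
    for p in products:
        ones = sum(p)
        for t in range(ones):                # shift this stream right by `delay`
            out[delay + t] |= p[t]
        delay += ones                        # next stream starts after this one
    return out

# Products 9/64, 5/64 and 12/64 accumulate to exactly 26/64 (no scaling),
# provided the total number of 1s fits in the stream length.
acc = delayed_or_accumulate([unary(9), unary(5), unary(12)])
assert sum(acc) == 9 + 5 + 12
print(f"accumulated value: {sum(acc)}/{L}")
```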
  5. Binary switches, which are the primitive units of all digital computing and information-processing hardware, are usually benchmarked on the basis of their ‘energy–delay product’, the product of the energy dissipated in completing the switching action and the time taken to complete it. The lower the energy–delay product, the better the switch (supposedly). This approach ignores the fact that lower energy dissipation and faster switching usually come at the cost of poorer reliability (i.e., a higher switching error rate), and hence the energy–delay product alone is not a good metric for benchmarking switches. Here, we show the trade-off between energy dissipation, energy–delay product and error probability for an electronic switch (a metal-oxide-semiconductor field-effect transistor), a magnetic switch (a magnetic tunnel junction switched with spin-transfer torque) and an optical switch (a bistable non-linear mirror). As expected, reducing the energy dissipation and/or energy–delay product generally increases the switching error probability and reduces reliability.
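    A small worked example of the trade-off, using the standard thermal-noise estimate p ≈ exp(-E/kT) for the error probability of a switch that dissipates energy E; the device physics of the three switches in the paper are not modeled:

```python
# Worked toy numbers: p_error ~ exp(-E/kT) for a switch dissipating energy E,
# with an assumed 1 ns delay. Lower E shrinks the EDP but inflates the error.
import math

KT = 4.14e-21            # k_B * T at 300 K, in joules
DELAY = 1e-9             # assumed switching delay, seconds

for e_over_kt in (10, 20, 30, 40):
    energy = e_over_kt * KT
    edp = energy * DELAY
    p_err = math.exp(-e_over_kt)
    print(f"E = {e_over_kt:>2} kT  EDP = {edp:.2e} J*s  p_error ~ {p_err:.1e}")
```

Halving the dissipated energy halves the energy-delay product but squares the error probability's exponent away from reliability, which is the paper's point: the two figures of merit cannot be judged in isolation.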