Rapid growth in Deep Neural Network (DNN) workloads has increased the energy footprint of Artificial Intelligence (AI) computing. For optimum energy efficiency, we propose operating DNN hardware in the Low-Power Computing (LPC) region. However, operating in the LPC region increases delay sensitivity to Process Variation (PV), and delay faults are one consequence of PV. In this paper, we demonstrate that DNNs are vulnerable to delay variations, which substantially lower prediction accuracy. To overcome delay faults, we present STRIVE, a post-fabrication fault detection and reactive error reduction technique. We also introduce a time-borrow correction technique to ensure error-free DNN computation.
                    
                            
                            STRIVE: Enabling Choke Point Detection and Timing Error Resilience in a Low-Power Tensor Processing Unit
                        
                    
    
        
    
- Award ID(s): 2106237
- PAR ID: 10445571
- Date Published:
- Journal Name: Proceedings ACM IEEE Design Automation Conference
- ISSN: 0738-100X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Binary switches, which are the primitive units of all digital computing and information processing hardware, are usually benchmarked on the basis of their ‘energy–delay product’, which is the product of the energy dissipated in completing the switching action and the time it takes to complete that action. The lower the energy–delay product, the better the switch (supposedly). This approach ignores the fact that lower energy dissipation and faster switching usually come at the cost of poorer reliability (i.e., a higher switching error rate) and hence the energy–delay product alone cannot be a good metric for benchmarking switches. Here, we show the trade-off between energy dissipation, energy–delay product and error–probability for an electronic switch (a metal oxide semiconductor field effect transistor), a magnetic switch (a magnetic tunnel junction switched with spin transfer torque) and an optical switch (bistable non-linear mirror). As expected, reducing energy dissipation and/or energy–delay product generally results in increased switching error probability and reduced reliability.
- Sparse deep neural networks (DNNs) have the potential to deliver compelling performance and energy efficiency without significant accuracy loss. However, their benefits can quickly diminish if their training is oblivious to the target hardware. For example, fewer critical connections can have a significant overhead if they translate into long-distance communication on the target hardware. Therefore, hardware-aware sparse training is needed to leverage the full potential of sparse DNNs. To this end, we propose a novel and comprehensive communication-aware sparse DNN optimization framework for tile-based in-memory computing (IMC) architectures. The proposed technique, CANNON, first maps the DNN layers onto the tiles of the target architecture. Then, it replaces the fully connected and convolutional layers with communication-aware sparse connections. After that, CANNON optimizes the communication cost with minimal impact on the DNN accuracy. Extensive experimental evaluations with a wide range of DNNs and datasets show up to 3.0× lower communication energy, 3.1× lower communication latency, and 6.8× lower energy-delay product compared to state-of-the-art pruning approaches, with a negligible impact on the classification accuracy on IMC-based machine learning accelerators.
- Several cyber-physical systems use real-time restart-based embedded systems with the Simplex architecture to provide safety guarantees against system faults. Some approaches have been developed to protect such systems from security violations too, but none of these approaches can prevent an adversary from modifying the operating system or application code to execute an attack that persists even after a reboot. In this work, we present a secure boot mechanism to restore real-time restart-based embedded systems into a secure computing environment after every restart. We analyze the delay introduced by the proposed security feature and present preliminary results to demonstrate the viability of our approach using an open-source bootloader and real-time operating system.
- Increasing processing requirements in the Artificial Intelligence (AI) realm have led to the emergence of domain-specific architectures for Deep Neural Network (DNN) applications. Tensor Processing Unit (TPU), a DNN accelerator by Google, has emerged as a front runner outclassing its contemporaries, CPUs and GPUs, in performance by 15×–30×. TPUs have been deployed in Google data centers to cater to the performance demands. However, a TPU’s performance enhancement is accompanied by a mammoth power consumption. In the pursuit of lowering the energy utilization, this paper proposes PREDITOR, a low-power TPU operating in the Near-Threshold Computing (NTC) realm. PREDITOR uses mathematical analysis to mitigate the undetectable timing errors by boosting the voltage of the selective multiplier-and-accumulator units at specific intervals to enhance the performance of the NTC TPU, thereby ensuring a high inference accuracy at low voltage. PREDITOR offers up to 3×–5× improved performance in comparison to the leading-edge error mitigation schemes with a minor loss in accuracy.
				
			 
					 
					
 
                                    