Transformer networks have achieved remarkable success in Natural Language Processing (NLP) and Computer Vision applications. However, the underlying large volumes of Transformer computations demand high reliability and resilience to soft errors in processor hardware. The objective of this research is to develop efficient techniques for design of error resilient Transformer architectures. To enable this, we first perform a soft error vulnerability analysis of every fully connected layers in Transformer computations. Based on this study, error detection and suppression modules are selectively introduced into datapaths to restore Transformer performance under anticipated error rate conditions. Memory access errors and neuron output errors are detected using checksums of linear Transformer computations. Correction consists of determining output neurons with out-of-range values and suppressing the same to zero. For a Transformer with nominal BLEU score of 52.7, such vulnerability guided selective error suppression can recover language translation performance from a BLEU score of 0 to 50.774 with as much as 0.001 probability of activation error, incurring negligible memory and computation overheads.
more »
« less
Error Resilient Neuromorphic Systems Using Embedded Predictive Neuron Checks
The reliability of emerging neuromorphic compute fabrics is of great concern due to their widespread use in critical data-intensive applications. Ensuring such reliability is difficult due to the intensity of underlying computations (billions of parameters), errors induced by low power operation and the complex relationship between errors in computations and their effect on network performance accuracy. We study the problem of designing error-resilient neuromorphic systems where errors can stem from: (a) soft errors in computation of matrix-vector multiplications and neuron activations, (b) malicious trojan and adversarial security attacks and (c) effects of manufacturing process variations on analog crossbar arrays that can affect DNN accuracy. The core principle of error detection relies on embedded
predictive neuron checks using invariants derived from the statistics of nominal neuron activation patterns of hidden layers of a neural network. Algorithmic encodings of hidden neuron function are also used to derive invariants for checking. A key contribution is designing checks that are robust to the inherent nonlinearity of neuron computations with minimal impact on error detection coverage. Once errors are detected, they are corrected using probabilistic methods due to the difficulties involved in exact error diagnosis in such complex systems. The technique is scalable across soft errors as well as a range of security attacks. The effects of manufacturing process variations are handled through the use of compact tests from which DNN performance can be assessed using learning techniques. Experimental results on a variety of neuromorphic test systems: DNNs, spiking networks and hyperdimensional computing are presented.
more »
« less
- Award ID(s):
- 2128419
- PAR ID:
- 10453092
- Date Published:
- Journal Name:
- Latin American Test Symposium
- Page Range / eLocation ID:
- 1-2
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Neuromorphic computing systems execute machine learning tasks designed with spiking neural networks. These systems are embracing non-volatile memory to implement high-density and low-energy synaptic storage. Elevated voltages and currents needed to operate non-volatile memories cause aging of CMOS-based transistors in each neuron and synapse circuit in the hardware, drifting the transistor’s parameters from their nominal values. If these circuits are used continuously for too long, the parameter drifts cannot be reversed, resulting in permanent degradation of circuit performance over time, eventually leading to hardware faults. Aggressive device scaling increases power density and temperature, which further accelerates the aging, challenging the reliable operation of neuromorphic systems. Existing reliability-oriented techniques periodically de-stress all neuron and synapse circuits in the hardware at fixed intervals, assuming worst-case operating conditions, without actually tracking their aging at run-time. To de-stress these circuits, normal operation must be interrupted, which introduces latency in spike generation and propagation, impacting the inter-spike interval and hence, performance (e.g., accuracy). We observe that in contrast to long-term aging, which permanently damages the hardware, short-term aging in scaled CMOS transistors is mostly due to bias temperature instability. The latter is heavily workload-dependent and, more importantly, partially reversible. We propose a new architectural technique to mitigate the aging-related reliability problems in neuromorphic systems by designing an intelligent run-time manager (NCRTM), which dynamically de-stresses neuron and synapse circuits in response to the short-term aging in their CMOS transistors during the execution of machine learning workloads, with the objective of meeting a reliability target. NCRTM de-stresses these circuits only when it is absolutely necessary to do so, otherwise reducing the performance impact by scheduling de-stress operations off the critical path. We evaluate NCRTM with state-of-the-art machine learning workloads on a neuromorphic hardware. Our results demonstrate that NCRTM significantly improves the reliability of neuromorphic hardware, with marginal impact on performance.more » « less
-
In this research, a low cost error detection and correction approach is developed for multilayer perceptron networks, where checker neurons are used to encode hidden layer functions using independent training experiments. Error detection and correction is predicated on validating consistency properties of the encoded checks and shows that high coverage of injected errors can be achieved with extremely low computational overhead.more » « less
-
Spiking Neural Networks (SNN) are fast emerging as an alternative option to Deep Neural Networks (DNN). They are computationally more powerful and provide higher energy-efficiency than DNNs. While exciting at first glance, SNNs contain security-sensitive assets (e.g., neuron threshold voltage) and vulnerabilities (e.g., sensitivity of classification accuracy to neuron threshold voltage change) that can be exploited by the adversaries. We explore global fault injection attacks using external power supply and laser-induced local power glitches on SNN designed using common analog neurons to corrupt critical training parameters such as spike amplitude and neuron’s membrane threshold potential. We also analyze the impact of power-based attacks on the SNN for digit classification task and observe a worst-case classification accuracy degradation of −85.65%. We explore the impact of various design parameters of SNN (e.g., learning rate, spike trace decay constant, and number of neurons) and identify design choices for robust implementation of SNN. We recover classification accuracy degradation by 30–47% for a subset of power-based attacks by modifying SNN training parameters such as learning rate, trace decay constant, and neurons per layer. We also propose hardware-level defenses, e.g., a robust current driver design that is immune to power-oriented attacks, improved circuit sizing of neuron components to reduce/recover the adversarial accuracy degradation at the cost of negligible area, and 25% power overhead. We also propose a dummy neuron-based detection of voltage fault injection at ∼1% power and area overhead each.more » « less
-
The last decade has seen tremendous advances in the application of artificial neural networks to solving problems that mimic human intelligence. Many of these systems are implemented using traditional digital compute engines where errors can occur during memory accesses or during numerical computation. While such networks are inherently error resilient, specific errors can result in incorrect decisions. This work develops a low overhead error detection and correction approach for multilayer artificial neural networks, here the hidden layer functions are approximated using checker neurons. Experimental results show that a high coverage of injected errors can be achieved with extremely low computational overhead using consistency properties of the encoded checks. A key side benefit is that the checks can flag errors when the network is presented outlier data that do not correspond to data with which the network is trained to operate.more » « less