Title: Built-In Functional Testing of Analog In-Memory Accelerators for Deep Neural Networks
The paper develops a methodology for online built-in self-testing of deep neural network (DNN) accelerators to validate their correct operation with respect to functional specifications. The DNN of interest is realized in hardware and performs in-memory computing using non-volatile memory cells as computational units. Assuming a functional fault model, we develop methods to generate pseudorandom and structured test patterns for detecting hardware faults. We also develop a test-sequencing strategy that combines these different classes of tests to achieve high fault coverage. The testing methodology is applied to a broad class of DNNs trained to classify images from the MNIST, Fashion-MNIST, and CIFAR-10 datasets. The goal is to expose hardware faults that may lead to the incorrect classification of images. We achieve an average fault coverage of 94% across these architectures, some of which are large and complex.
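To make the pattern-generation idea concrete, here is a minimal sketch, assuming a stuck-at-zero fault model on individual crossbar cells and an output-divergence detection criterion; the matrix sizes, pattern counts, and function names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: pseudorandom functional testing of one in-memory crossbar layer.
# The fault model (single stuck-at-zero cells) and all names are assumptions.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8))  # fault-free crossbar weights

def faulty_copy(W, row, col):
    """Inject a single stuck-at-zero fault into one memory cell."""
    Wf = W.copy()
    Wf[row, col] = 0.0
    return Wf

def detects(W, Wf, pattern, tol=1e-6):
    """A test pattern detects the fault if the layer outputs diverge."""
    return not np.allclose(pattern @ W, pattern @ Wf, atol=tol)

# Pseudorandom binary input patterns applied to the crossbar rows.
patterns = rng.integers(0, 2, size=(64, 16)).astype(float)

faults = [(r, c) for r in range(16) for c in range(8)]
covered = sum(
    any(detects(W, faulty_copy(W, r, c), p) for p in patterns)
    for r, c in faults
)
print(f"fault coverage: {covered / len(faults):.1%}")
```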
Award ID(s):
2008167
PAR ID:
10358685
Date Published:
Journal Name:
Electronics
Volume:
11
Issue:
16
ISSN:
2079-9292
Page Range / eLocation ID:
2592
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Functional broadside tests were developed to avoid overtesting of delay faults. The tests achieve this goal by creating functional operation conditions during their functional capture cycles. To increase the achievable fault coverage, close-to-functional scan-based tests are allowed to deviate from functional operation conditions. This article suggests that a more comprehensive functional broadside test set can instead be obtained by replacing target faults that cannot be detected with faults that have similar (but not identical) detection conditions. Such a test set has the advantage that it still maintains functional operation conditions, and it covers the test holes that arise when target faults cannot be detected by detecting similar faults instead. The article considers the case where the target faults are transition faults. When a standard transition fault, with an extra delay of a single clock cycle, cannot be detected, an unspecified transition fault is used instead; an unspecified transition fault captures the behaviors of transition faults with different extra delays. When this fault cannot be detected either, a stuck-at fault is used, since a stuck-at fault has some of the detection conditions of a transition fault. Multicycle functional broadside tests are used to allow unspecified transition faults to be detected, and test compaction occurs as a by-product. The structure of the test generation procedure accommodates the complexity of producing functional broadside tests by considering the target and replacement faults together. Experimental results for benchmark circuits demonstrate the fault coverage improvements achieved and the effect on the number of tests.
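A minimal sketch of the fault-replacement ordering described above; the function names and the detection oracle `try_detect` are assumptions for illustration, not the article's actual procedure.

```python
# Illustrative fault-replacement chain: when a target transition fault is
# undetectable under functional operation conditions, fall back to a similar
# fault with weaker detection conditions.
def replacement_chain(fault):
    """Yield the target fault, then progressively weaker replacements."""
    yield ("transition", fault)              # extra delay of one clock cycle
    yield ("unspecified-transition", fault)  # any extra delay
    yield ("stuck-at", fault)                # shares some detection conditions

def generate_test(fault, try_detect):
    """Return (model, test) for the first detectable fault in the chain.

    try_detect(model, fault) stands in for ATPG under functional
    constraints; it returns a test or None.
    """
    for model, f in replacement_chain(fault):
        test = try_detect(model, f)
        if test is not None:
            return model, test
    return None, None  # the test hole remains uncovered
```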
  2. Vision Transformers (ViTs) have reshaped computer vision by replacing traditional Convolutional Neural Networks (CNNs) with attention-based architectures that process input images as sequences of patches. ViTs achieve strong performance on tasks such as image classification and object detection due to their ability to capture global dependencies within input data. While their software implementations are widely adopted, deploying ViTs in hardware introduces several challenges, including fault tolerance in the presence of hardware failures, real-time reliability, and high computational requirements. Permanent faults in processing elements, interconnections, or memory subsystems lead to incorrect computations and degraded system performance. This paper proposes a fault-tolerant hardware implementation of ViTs that overcomes these challenges by integrating real-time fault detection and recovery mechanisms. The architecture includes four primary units: patch embedding, encoder, decoder, and a Multi-Layer Perceptron (MLP), supported by fault-tolerant components such as lightweight recompute units, a centralized Built-In Self-Test (BIST) unit, and a learning-based decision-making system built on a decision-tree model. These units are interconnected through a centralized global buffer for efficient data transfer, ensuring seamless operation even under fault conditions.
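As a rough illustration of the learning-based decision-making component, here is a sketch that maps a BIST fault signature to a recovery action with a decision tree; the features, actions, and training data are invented for illustration and are not the paper's design.

```python
# Hypothetical sketch: a decision tree chooses a recovery action from a
# BIST fault report. Features and action labels are assumptions.
from sklearn.tree import DecisionTreeClassifier
import numpy as np

# features: [unit_id, fault_count, is_memory_fault, latency_margin]
X = np.array([
    [0, 1, 0, 5],   # single PE fault, timing slack available
    [1, 3, 1, 1],   # clustered memory faults, tight timing
    [2, 1, 1, 4],
    [3, 2, 0, 0],
])
y = ["recompute", "remap_buffer", "remap_buffer", "recompute"]

clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(clf.predict([[0, 2, 0, 3]]))  # action for a new BIST report
```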
  3. Hardware accelerators are essential for accommodating ever-increasing Deep Neural Network (DNN) workloads on resource-constrained embedded devices. While accelerators facilitate fast and energy-efficient DNN operations, their accuracy is threatened by faults in their on-chip and off-chip memories, where millions of DNN weights are held. The use of emerging Non-Volatile Memories (NVM) further exposes DNN accelerators to a non-negligible rate of permanent defects due to immature fabrication, limited endurance, and aging. To tolerate defects in NVM-based DNN accelerators, previous work either requires extra redundancy in hardware or performs defect-aware retraining, imposing significant overhead. In comparison, this paper proposes a set of algorithms that exploit the flexibility in setting the fault-free bits in weight memory to approximate weight values effectively, so as to mitigate the defect-induced accuracy drop. These algorithms can be applied as a one-step solution when loading the weights onto embedded devices; they require only trivial hardware support and impose negligible run-time overhead. Experiments on popular DNN models show that the proposed techniques successfully boost inference accuracy even in the face of elevated defect rates in the weight memory.
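A minimal sketch of the underlying idea, defect-aware weight approximation: with some bit cells stuck, choose the remaining fault-free bits so the stored value lands as close as possible to the intended weight. The 8-bit unsigned encoding and brute-force search are illustrative assumptions, not the paper's algorithms.

```python
# Hypothetical sketch: approximate a weight given stuck memory cells by
# searching the values consistent with the defective bits.
def approximate(weight, stuck_mask, stuck_vals, bits=8):
    """Return the representable value closest to `weight`.

    stuck_mask: bitmask of defective cells; stuck_vals: their frozen values.
    """
    best = None
    for cand in range(1 << bits):
        # keep only candidates consistent with the stuck cells
        if (cand & stuck_mask) != (stuck_vals & stuck_mask):
            continue
        if best is None or abs(cand - weight) < abs(best - weight):
            best = cand
    return best

# example: the cell holding bit 2 is stuck at 1
print(approximate(weight=137, stuck_mask=0b100, stuck_vals=0b100))  # -> 135
```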
  4. The use of multicycle tests, with several functional capture cycles between scan operations, contributes significantly to the ability to compact a test set. Multicycle tests have the added benefit that they can contribute to the detection of defects with complex behaviors that are not detected by single-cycle or two-cycle tests. To ensure that this benefit is realized when test compaction is applied to transition faults, this article suggests incorporating into the test compaction procedure an additional fault model whose fault coverage increases when multicycle tests are used. To ensure that the computational complexity of test compaction is not increased by a fault model with a large number of faults, or faults with complex behaviors, the added fault model is required to have the same characteristics as the transition fault model. A type of transition fault called the unspecified transition fault satisfies these requirements. The article describes a test compaction procedure for transition faults that incorporates unspecified transition faults, and presents experimental results for benchmark circuits to demonstrate the levels of test compaction and fault coverage that can be achieved.
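A minimal sketch of greedy test compaction over the two fault models together, ordinary transition faults plus unspecified transition faults; the data structures and the greedy selection rule are assumptions, not the article's procedure.

```python
# Illustrative greedy compaction: pick tests until no test adds detections
# for either fault model.
def compact(tests, detected_by):
    """Return a compacted test list.

    detected_by: test -> set of (model, fault) pairs it detects.
    """
    remaining = set().union(*detected_by.values())
    selected = []
    while remaining:
        best = max(tests, key=lambda t: len(detected_by[t] & remaining))
        gain = detected_by[best] & remaining
        if not gain:
            break
        selected.append(best)
        remaining -= gain
    return selected

tests = ["t1", "t2", "t3"]
detected_by = {
    "t1": {("transition", "f1"), ("unspecified", "f2")},
    "t2": {("transition", "f1")},
    "t3": {("unspecified", "f3")},
}
print(compact(tests, detected_by))  # ['t1', 't3']
```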
  5. While adequacy criteria offer an end-point for testing, they do not mandate how targets are covered. Branch Coverage may be attained through direct calls to methods or through indirect calls between methods. Automated generation is biased towards the rapid gains offered by indirect coverage; therefore, even with the same end-goal, humans and automation produce very different tests. Direct coverage may yield tests that are more understandable and that detect faults missed by traditional approaches. However, the added burden on the generation framework may result in lower coverage, and faults that emerge through method interactions may be missed. To compare the two approaches, we generated test suites for both, judging efficacy against real faults. We found that requiring direct coverage results in lower achieved coverage and lower likelihood of fault detection. However, both forms of Branch Coverage cover code and detect faults that the other does not. By isolating methods, Direct Branch Coverage is less constrained in the choice of input, while traditional Branch Coverage is able to leverage method interactions to discover faults. Ultimately, both are situationally applicable within the context of a broader testing strategy.
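A tiny illustration of the distinction, assuming invented functions: `helper`'s branches can be covered directly (by calling it by name) or indirectly (through `api`), and the two styles exercise different behavior.

```python
# Hypothetical example of direct vs. indirect Branch Coverage.
def helper(x):
    return "big" if x > 10 else "small"

def api(data):
    return [helper(v) for v in data]

def test_direct():
    # Direct Branch Coverage: exercise both branches of helper() by name,
    # with unconstrained inputs.
    assert helper(11) == "big"
    assert helper(3) == "small"

def test_indirect():
    # Traditional coverage: the same branches reached through api(),
    # which also exercises the method interaction.
    assert api([11, 3]) == ["big", "small"]
```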