-
Binary Hyperdimensional Computing primitives offer significant energy and hardware benefits for power constrained edge and portable AI applications. However, these are vulnerable to timing and soft errors in the associative memory that stores high-dimensional data representations. This is made worse under aggressive power conditions, such as from voltage overscaling, resulting in significant performance loss. In this research, class hypervectors are represented in matrix form and the matrix columns are ordered in terms of criticality (with respect to classification accuracy). Subsequently, cosine similarity is used to cluster the critical matrix columns into submatrices, one submatrix for each cluster. Traditional algorithmic checksums suffer from aliasing due to their binary representations. To address this, we use checks based on permutations of the columns of the submatrix for error detection. For error correction, we introduce a novel majority-vote column reconstruction (MVCR) algorithm, where the erroneous submatrix columns are replaced with a column vector representing a majority vote across all the column vectors of the submatrix. In contrast, erroneous noncritical matrix columns are suppressed to zero. The proposed approach is validated on SRAM-based platforms under voltage scaling, achieving up to 4X improvement in error resilience over state-of-the-art methods with minimal overhead.
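The correction step described above can be illustrated with a minimal sketch. It assumes binary columns stored as Python lists, a tie-breaking rule that favors 1, and an externally supplied set of erroneous column indices; the function names `majority_vote_column` and `repair_cluster` are hypothetical, and the permutation-based detection step is not modeled here.

```python
from typing import List, Sequence, Set


def majority_vote_column(columns: Sequence[Sequence[int]]) -> List[int]:
    """Element-wise majority vote across binary columns (assumption: ties go to 1)."""
    n = len(columns)
    rows = len(columns[0])
    return [1 if 2 * sum(col[r] for col in columns) >= n else 0 for r in range(rows)]


def repair_cluster(columns: Sequence[Sequence[int]], erroneous: Set[int]) -> List[List[int]]:
    """Replace each flagged column of one submatrix with the majority-vote column.

    The vote is taken across all columns of the submatrix, as stated in the
    abstract; unflagged columns are copied unchanged.
    """
    vote = majority_vote_column(columns)
    return [list(vote) if j in erroneous else list(col) for j, col in enumerate(columns)]


def suppress_noncritical(columns: Sequence[Sequence[int]], erroneous: Set[int]) -> List[List[int]]:
    """Zero out flagged noncritical columns instead of reconstructing them."""
    return [[0] * len(col) if j in erroneous else list(col) for j, col in enumerate(columns)]
```

Because the columns in a cluster were grouped by cosine similarity, the majority column is a plausible stand-in for any single corrupted member, which is the intuition this sketch tries to capture.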
-
Compute-in-memory (CiM) based convolutional neural network (CNN) accelerators achieve low-power inference, utilizing memristive crossbar arrays for matrix multiplications. However, inherent conductance variations within the crossbar introduce computational errors. These errors propagate to the CNN output and cause image misclassification, leading to substantial accuracy degradation. This paper addresses the critical challenge of efficient and reliable post-manufacture testing for CiM-based CNN accelerators. We propose a novel test image sampling methodology, which iteratively applies sampled images from the CNN’s testing dataset using progressive random sampling (PRS) to a device under test (DUT) and estimates a confidence interval for the DUT accuracy. Based on the confidence interval and the acceptable accuracy threshold, the test labels a DUT as “pass” or “fail”. Furthermore, if we have access to an initial set of DUTs, we apply the images from the CNN’s testing dataset to these DUTs and leverage the DUT outputs to rank-order test images. We develop a sequential estimation test (SET) framework, where the images from the CNN’s testing dataset are sequentially applied according to a predetermined rank and the test terminates when a DUT can be confidently labeled as “pass” or “fail” based on the applied images. In each case, the number of applied test images adapts to the quality of the DUT. Experiments show that PRS and SET achieve 2.2× and 4.6× speedup compared to state-of-the-art test methodologies.
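The confidence-driven stopping rule above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses a Wilson score interval (the abstract does not say which interval is used), a fixed batch size, and a hypothetical callback `is_correct(index)` that reports whether the DUT classified a given test image correctly.

```python
import math
import random
from typing import Callable, Tuple


def wilson_interval(correct: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed interval type)."""
    p = correct / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def sampled_accuracy_test(
    is_correct: Callable[[int], bool],
    num_images: int,
    threshold: float,
    batch: int = 20,
    seed: int = 0,
) -> Tuple[str, int]:
    """Apply randomly ordered images until the interval clears the threshold.

    Returns the verdict and the number of images applied.
    """
    order = list(range(num_images))
    random.Random(seed).shuffle(order)  # random sampling without replacement
    correct = 0
    for applied, idx in enumerate(order, start=1):
        correct += 1 if is_correct(idx) else 0
        if applied % batch == 0 or applied == num_images:
            lo, hi = wilson_interval(correct, applied)
            if lo > threshold:
                return "pass", applied
            if hi < threshold:
                return "fail", applied
    # Interval never cleared the threshold: fall back to the full-set accuracy.
    return ("pass" if correct / num_images >= threshold else "fail"), num_images
```

A DUT whose accuracy is far from the threshold terminates after few images, while a marginal DUT consumes more of the dataset, which matches the adaptive behavior the abstract describes.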
-
Language models (LMs) have revolutionized natural language processing (NLP) tasks such as question answering and code generation. However, escalating energy costs have accelerated the use of analog compute-in-memory (CiM) based architectures for energy-efficient language model inference. A key barrier to widespread adoption of such architectures is posed by device-to-device and cycle-to-cycle conductance variations within analog crossbars of CiM-based systems that degrade text generation quality. To ensure reliable operation of analog CiM-based LMs (also referred to as devices under test or DUTs), fast and accurate functional testing of CiM structures is necessary post-manufacture. We first assess the impact of conductance variations within crossbars on an LM’s log-perplexity, an evaluation metric for the quality of text generation. To derive a functional test, we propose confidence-driven compact testing of CiM-based LMs. In this framework, we down-select text sequences from a benchmarking dataset to create a compact sequence set. During testing, we iteratively apply sequences from the compact sequence set to the CiM-based LM and estimate a confidence interval of the DUT’s log-perplexity. Depending on the log-perplexity confidence interval (LPCI) and a predefined performance (log-perplexity) threshold, a DUT is classified as “good” or “bad”. The combination of confidence-driven testing and sequence down-selection achieves 14.8-272× speedup compared to exhaustive testing while maintaining 0% critical test escape for GPT-2.
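As a rough illustration of the LPCI decision described above, the sketch below computes a normal-approximation confidence interval over per-sequence log-perplexities. The interval type, the `z` value, the "undecided" outcome, and the convention that a DUT is "good" when its interval lies below the threshold (lower log-perplexity being better) are all assumptions; the function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def log_perplexity(token_log_probs: Sequence[float]) -> float:
    """Negative mean log-probability of the tokens in one sequence."""
    return -mean(token_log_probs)


def lpci(seq_log_perplexities: Sequence[float], z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation interval for the mean log-perplexity (needs >= 2 values)."""
    m = mean(seq_log_perplexities)
    half = z * stdev(seq_log_perplexities) / math.sqrt(len(seq_log_perplexities))
    return m - half, m + half


def classify(interval: Tuple[float, float], threshold: float) -> str:
    """Label the DUT once the interval lies entirely on one side of the threshold."""
    lo, hi = interval
    if hi < threshold:
        return "good"
    if lo > threshold:
        return "bad"
    return "undecided"  # keep applying sequences from the compact set
```

In an iterative test, `classify` would be re-evaluated after each newly applied sequence, and the test would stop at the first "good" or "bad" verdict.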
-
Generative adversarial networks (GANs) are promising for a range of applications, including image translation and denoising, as well as synthetic data generation. These applications can be mapped to memristive crossbar arrays (MCAs) for ultra-high energy efficiency and portability. However, conductance variation within analog crossbars degrades the quality of the GAN outputs and necessitates robust post-manufacturing testing. We propose a two-stage adaptive test framework for compute-in-memory (CiM) based GANs, comprising an exhaustive test and a compact test. The exhaustive test measures the inception score of a device under test (DUT) by applying a large number of noise vectors, called the exhaustive noise set. To reduce test time, a compact test estimates the inception score of a DUT from a carefully chosen subset of these vectors, called the compact noise set. The compact noise set is determined by a binary mask optimized with a novel backpropagation-guided algorithm to minimize the difference between the estimated and true inception scores of the DUTs. Finally, to leverage both the accuracy of the exhaustive test and the speed of the compact test, the proposed adaptive test framework first applies the compact test to every DUT. Only the DUTs that yield low-confidence classifications are then subjected to the exhaustive test. Experiments show that this adaptive approach achieves less than 1% test escapes while offering up to 7.26× speedup compared to the exhaustive test.
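The two-stage flow above can be sketched in a few lines. The abstract does not say how "low confidence" is determined, so this sketch assumes a simple guard band around the pass threshold; `compact_estimate`, `exhaustive_measure`, and `guard_band` are hypothetical names introduced for illustration.

```python
from typing import Callable, Tuple


def adaptive_test(
    compact_estimate: Callable[[], float],
    exhaustive_measure: Callable[[], float],
    threshold: float,
    guard_band: float,
) -> Tuple[str, str]:
    """Run the compact test first; escalate only when its estimate is near the threshold.

    Returns (verdict, stage_used). A higher inception score is treated as better.
    """
    estimate = compact_estimate()
    if abs(estimate - threshold) > guard_band:
        # Assumed confidence rule: the estimate is far enough from the threshold.
        return ("pass" if estimate >= threshold else "fail"), "compact"
    score = exhaustive_measure()  # slow path, used only for marginal DUTs
    return ("pass" if score >= threshold else "fail"), "exhaustive"
```

The speedup of such a scheme depends on how many DUTs fall inside the guard band, since only those pay the cost of the exhaustive noise set.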
-
Stringent quality requirements for safety-critical applications drive the demand for “zero defects” in modern ICs. In this context, delay characterization of standard cells for resistive open defects is an increasing concern due to aggressive timing margins in digital circuits. The problem is made worse by the large number of open defect sites in standard cells, combined with a wide range of defect resistance values for each site. This can incur prohibitive costs for defect simulation and characterization. To alleviate this complexity, we propose Resistive Fault Dominance (RFD) for resistive open defects. RFD eliminates simulations of certain open defects with intermediate defect resistance values that are guaranteed to exceed specified timing margins for standard cells, based on tests for specific “dominant” open defects. This can reduce the computational cost of cell library characterization and the simulation effort by 84%-91%. An algorithmic fault simulation methodology for resistive open defects on parasitic-extracted (PEX) transistor-level netlists is developed.
-
Emerging brain-inspired hyperdimensional computing (HDC) algorithms are vulnerable to timing and soft errors in associative memory used to store high-dimensional data representations. Such errors can significantly degrade HDC performance. A key challenge is error correction after an error in computation is detected. This work presents two novel error resilience frameworks for hyperdimensional computing systems. The first, called the checksum hypervector encoding (CHE) framework, relies on creation of a single additional hypervector that is a checksum of all the class hypervectors of the HDC system. For error resilience, elementwise validation of the checksum property is performed and those elements across all class vectors for which the property fails are removed from consideration. For an HDC system with K class hypervectors of dimension D, the second framework, cross-hypervector clustering (CHC), clusters D K-dimensional vectors, the i-th of which consists of the i-th element of each of the K HDC class hypervectors, 1 ≤ i ≤ D. Statistical properties of these vector clusters are checked prior to each hypervector query and all the elements of all K-dimensional vectors corresponding to statistical outlier vectors are removed as before. The choice of which framework to use is dictated by the complexity of the dataset to classify. Up to three orders of magnitude better resilience to errors than the state-of-the-art across multiple HDC high-dimensional encoding (representation) systems is demonstrated.
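A minimal sketch of the CHE idea follows. The abstract does not define the checksum, so this example assumes binary hypervectors and an element-wise parity (XOR) checksum, with Hamming distance as the similarity measure; all function names are hypothetical.

```python
from typing import List, Sequence


def checksum_hypervector(class_hvs: Sequence[Sequence[int]]) -> List[int]:
    """Element-wise parity across all class hypervectors (assumed checksum)."""
    return [sum(bits) % 2 for bits in zip(*class_hvs)]


def valid_positions(class_hvs: Sequence[Sequence[int]], checksum: Sequence[int]) -> List[int]:
    """Indices where the stored class hypervectors still satisfy the checksum."""
    return [i for i, bits in enumerate(zip(*class_hvs)) if sum(bits) % 2 == checksum[i]]


def masked_hamming(query: Sequence[int], hv: Sequence[int], keep: Sequence[int]) -> int:
    """Hamming distance restricted to the positions that passed validation."""
    return sum(1 for i in keep if query[i] != hv[i])


def classify(query: Sequence[int], class_hvs: Sequence[Sequence[int]], checksum: Sequence[int]) -> int:
    """Return the index of the nearest class, ignoring positions that fail the check."""
    keep = valid_positions(class_hvs, checksum)
    distances = [masked_hamming(query, hv, keep) for hv in class_hvs]
    return distances.index(min(distances))
```

Dropping a failing position from every class vector, rather than trying to locate the faulty one, keeps the comparison fair across classes at the cost of a slightly shorter effective dimension. A parity check of this kind misses an even number of flips at the same position.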
-
Time-to-first-spike (TTFS) encoded spiking neural networks (SNNs), implemented using memristive crossbar arrays (MCA), achieve higher inference speed and energy efficiency compared to artificial neural networks (ANNs) and rate encoded SNNs. However, memristive crossbar arrays are vulnerable to conductance variations in the embedded memristor cells. These degrade the performance of TTFS encoded SNNs, namely their classification accuracy, with adverse impact on the yield of manufactured chips. To combat this yield loss, we propose a post-manufacture testing and tuning framework for these SNNs. In the testing phase, a timing encoded signature of the SNN, which is statistically correlated to the SNN performance, is extracted. In the tuning phase, this signature is mapped to optimal values of the tuning knobs (gain parameters), one parameter per layer, using a trained regressor, allowing very fast tuning (about 150 ms). To further reduce the tuning overhead, we rank-order hidden layer neurons based on their criticality and show that adding gain programmability to only 50% of the neurons is sufficient for performance recovery. Experiments show that the proposed framework can improve yield by up to 34% and average accuracy of memristive SNNs by up to 9%.
-
IEEE (Ed.) Resistive random access memory (RRAM) based spiking neural networks (SNN) are becoming increasingly attractive for pervasive energy-efficient classification tasks. However, such networks suffer from degradation of performance (as determined by classification accuracy) due to the effects of process variations on fabricated RRAM devices, resulting in loss of manufacturing yield. To address such yield loss, a two-step approach is developed. First, an alternative test framework is used to predict the performance of fabricated RRAM based SNNs using the SNN response to a small subset of images from the test image dataset, called the SNN response signature (to minimize test cost). This diagnoses those SNNs that need to be performance-tuned for yield recovery. Next, SNN tuning is performed by modulating the spiking thresholds of the SNN neurons on a layer-by-layer basis using a trained regressor that maps the SNN response signature to the optimal spiking threshold values during tuning. The optimal spiking threshold values are determined by an off-line optimization algorithm. Experiments show that the proposed framework can reduce the number of out-of-spec SNN devices by up to 54% and improve yield by as much as 8.6%.
-
While resistive random access memory (RRAM) based deep neural networks (DNN) are important for low-power inference in IoT and edge applications, they are vulnerable to the effects of manufacturing process variations that degrade their performance (classification accuracy). However, for post-manufacture testing of such DNNs, the (image) dataset used to train the associated machine learning applications may not be available to the RRAM crossbar manufacturer for privacy reasons. As such, the performance of DNNs needs to be assessed with carefully crafted dataset-agnostic synthetic test images that expose anomalies in the crossbar manufacturing process to the maximum extent possible. In this work, we propose a dataset-agnostic post-manufacture testing framework for RRAM-based DNNs using Entropy Guided Image Synthesis (EGIS). We first create a synthetic image dataset such that the DNN outputs corresponding to the synthetic images minimize an entropy-based loss metric. Next, a small subset (consisting of 10-20 images) of the synthetic image dataset, called the compact image dataset, is created to expedite testing. The response of the device under test (DUT) to the compact image dataset is passed to a machine learning based outlier detector for pass/fail labeling of the DUT. The test accuracy obtained with such synthetic test images is very close to that of contemporary test methods.
-
The kernel two-sample test based on the maximum mean discrepancy is one of the most popular methods for detecting differences between two distributions over general metric spaces. In this paper we propose a method to boost the power of the kernel test by combining maximum mean discrepancy estimates over multiple kernels using their Mahalanobis distance. We derive the asymptotic null distribution of the proposed test statistic and use a multiplier bootstrap approach to efficiently compute the rejection region. The resulting test is universally consistent and, since it is obtained by aggregating over a collection of kernels/bandwidths, is more powerful in detecting a wide range of alternatives in finite samples. We also derive the distribution of the test statistic for both fixed and local contiguous alternatives. The latter, in particular, implies that the proposed test is statistically efficient, that is, it has nontrivial asymptotic (Pitman) efficiency. The consistency properties of the Mahalanobis and other natural aggregation methods are also explored when the number of kernels is allowed to grow with the sample size. Extensive numerical experiments are performed on both synthetic and real-world datasets to illustrate the efficacy of the proposed method over single-kernel tests. The computational complexity of the proposed method is also studied, both theoretically and in simulations. Our asymptotic results rely on deriving the joint distribution of the maximum mean discrepancy estimates using the framework of multiple stochastic integrals, which is more broadly useful, specifically, in understanding the efficiency properties of recently proposed adaptive maximum mean discrepancy tests based on kernel aggregation and also in developing more computationally efficient, linear-time tests that combine multiple kernels. We conclude with an application of the Mahalanobis aggregation method for kernels with diverging scaling parameters.
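For orientation, the general shape of such an aggregated statistic can be written out. The first line is the standard unbiased estimate of the squared maximum mean discrepancy for one kernel; the second stacks the estimates for r kernels and combines them in a Mahalanobis-type quadratic form. The symbols m, n, r, k_a, and the covariance estimate are notation assumed here, and the sample-size scaling and the exact covariance estimator used in the paper are not reproduced.

```latex
% Unbiased MMD^2 estimate for kernel k_a, given samples x_1,...,x_m and y_1,...,y_n
\widehat{\mathrm{MMD}}^2_a
  = \frac{1}{m(m-1)} \sum_{i \neq j} k_a(x_i, x_j)
  + \frac{1}{n(n-1)} \sum_{i \neq j} k_a(y_i, y_j)
  - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k_a(x_i, y_j)

% Stack the r single-kernel estimates and aggregate them (normalization omitted)
\mathbf{M} = \bigl(\widehat{\mathrm{MMD}}^2_1, \dots, \widehat{\mathrm{MMD}}^2_r\bigr)^{\top},
\qquad
T = \mathbf{M}^{\top} \widehat{\Sigma}^{-1} \mathbf{M}
```

Here the matrix in the quadratic form stands for an estimate of the covariance of the stacked vector under the null hypothesis, so that kernels whose estimates are noisy or strongly correlated are down-weighted rather than simply summed.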