The reliability of emerging neuromorphic compute fabrics is of great concern due to their widespread use in critical data-intensive applications. Ensuring such reliability is difficult due to the scale of the underlying computations (billions of parameters), errors induced by low-power operation, and the complex relationship between errors in computation and their effect on network accuracy. We study the problem of designing error-resilient neuromorphic systems in which errors can stem from: (a) soft errors in the computation of matrix-vector multiplications and neuron activations, (b) malicious trojan and adversarial security attacks, and (c) manufacturing process variations in analog crossbar arrays that can degrade DNN accuracy. The core principle of error detection relies on embedded predictive neuron checks using invariants derived from the statistics of nominal neuron activation patterns in the hidden layers of a neural network. Algorithmic encodings of hidden neuron function are also used to derive invariants for checking. A key contribution is the design of checks that are robust to the inherent nonlinearity of neuron computations with minimal impact on error detection coverage. Once errors are detected, they are corrected using probabilistic methods, owing to the difficulty of exact error diagnosis in such complex systems. The technique scales across soft errors as well as a range of security attacks. The effects of manufacturing process variations are handled through compact tests from which DNN performance can be assessed using learning techniques. Experimental results are presented on a variety of neuromorphic test systems: DNNs, spiking neural networks, and hyperdimensional computing.
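A minimal sketch of the statistics-based invariant checking idea described above, assuming hidden-layer activations can be observed and that a simple mean-activation statistic with calibrated bounds serves as the invariant; the layer names, thresholds, and synthetic data are illustrative, not the paper's actual check design.

```python
import numpy as np

def calibrate_invariants(nominal_activations, k=4.0):
    """Derive per-layer invariant bounds from error-free (nominal) runs.

    nominal_activations: dict mapping layer name -> array of shape
    (num_samples, num_neurons) observed on fault-free hardware.
    """
    bounds = {}
    for layer, acts in nominal_activations.items():
        per_sample_mean = acts.mean(axis=1)              # summary statistic per input
        mu, sigma = per_sample_mean.mean(), per_sample_mean.std()
        bounds[layer] = (mu - k * sigma, mu + k * sigma)
    return bounds

def check_invariants(runtime_activations, bounds):
    """Concurrent check: flag layers whose activation statistics drift outside
    the nominal envelope (possible soft error, attack, or process variation)."""
    flagged = []
    for layer, acts in runtime_activations.items():
        lo, hi = bounds[layer]
        if not lo <= acts.mean() <= hi:                  # same statistic as calibration
            flagged.append(layer)
    return flagged

# Illustrative use with synthetic activations standing in for real hidden layers.
rng = np.random.default_rng(0)
nominal = {"hidden1": rng.normal(1.0, 0.2, (1000, 128)),
           "hidden2": rng.normal(0.5, 0.1, (1000, 64))}
bounds = calibrate_invariants(nominal)
faulty = {"hidden1": rng.normal(1.0, 0.2, (1, 128)) + 3.0,   # injected error
          "hidden2": rng.normal(0.5, 0.1, (1, 64))}
print(check_invariants(faulty, bounds))                      # -> ['hidden1']
```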
Encoded Check Driven Concurrent Error Detection in Particle Filters for Nonlinear State Estimation
In this paper we propose a framework for concurrent detection of soft computation errors in particle filters, which are finding increasing use in robotics applications. The particle filter works by sampling the multivariate probability distribution of the states of a system (the samples, called particles, each represent a vector of states) and projecting these into the future using appropriate nonlinear mappings. We propose the addition of a 'check' state to the system, defined as a linear combination of the system states, for error detection. The check state produces an error signal corresponding to each particle, whose statistics are tracked across a sliding time window. Shifts in the error statistics across all particles are used to detect soft computation errors as well as anomalous sensor measurements. Simulation studies indicate that errors in particle filter computations can be detected with high coverage and low latency.
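Below is a minimal numerical sketch of the check-state idea, assuming an illustrative two-state nonlinear model, arbitrary check weights, and a fixed alarm threshold; the redundant "reference" update is only a stand-in to keep the demo self-contained and is not how the check state is actually propagated in the paper's scheme.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x, noise):
    """Illustrative nonlinear state-transition map for a two-state system."""
    return np.stack([x[:, 0] + 0.1 * np.sin(x[:, 1]),
                     0.95 * x[:, 1] + 0.1 * x[:, 0]], axis=1) + noise

N, W, thresh = 500, 10, 1e-3        # particles, sliding-window length, alarm threshold
c = np.array([1.0, -2.0])           # weights defining the check state c^T x
particles = rng.normal(0.0, 1.0, (N, 2))
window = []                         # sliding window of per-step error statistics

for t in range(60):
    noise = rng.normal(0.0, 0.01, particles.shape)
    reference = f(particles, noise)          # check-side propagation (stand-in for the
                                             # low-cost encoded update used in practice)
    primary = reference.copy()
    if t >= 40:                              # inject a soft computation error into part
        primary[:50, 0] += 0.5               # of the primary particle update
    check = reference @ c                    # check value for every particle
    particles = primary                      # the filter keeps the primary result
    err = check - particles @ c              # per-particle error signal
    window = (window + [float(np.abs(err).mean())])[-W:]
    if np.mean(window) > thresh:             # shift in error statistics -> detection
        print(f"soft error detected at step {t}")
        break
```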
- Award ID(s): 1723997
- PAR ID: 10274775
- Date Published:
- Journal Name: International On-Line Testing Conference
- Page Range / eLocation ID: 1 to 6
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Emerging brain-inspired hyperdimensional computing (HDC) algorithms are vulnerable to timing and soft errors in the associative memory used to store high-dimensional data representations. Such errors can significantly degrade HDC performance. A key challenge is error correction after an error in computation is detected. This work presents two novel error resilience frameworks for hyperdimensional computing systems. The first, called the checksum hypervector encoding (CHE) framework, relies on the creation of a single additional hypervector that is a checksum of all the class hypervectors of the HDC system. For error resilience, elementwise validation of the checksum property is performed, and those elements across all class vectors for which the property fails are removed from consideration. For an HDC system with K class hypervectors of dimension D, the second framework, cross-hypervector clustering (CHC), clusters the D K-dimensional vectors whose i-th vector consists of the i-th element of each of the K HDC class hypervectors, 1 ≤ i ≤ D. Statistical properties of these vector clusters are checked prior to each hypervector query, and all the elements of all K-dimensional vectors corresponding to statistical outlier vectors are removed as before. The choice of which framework to use is dictated by the complexity of the dataset to classify. Up to three orders of magnitude better resilience to errors than the state of the art is demonstrated across multiple HDC high-dimensional encoding (representation) systems.
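A minimal sketch of the CHE elementwise checksum validation, assuming bipolar class hypervectors and a dot-product associative query; the dimensions, error model, and masking rule shown here are illustrative rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
K, D = 10, 4096                              # number of classes, hypervector dimension

# Class hypervectors (bipolar encoding assumed) and their checksum hypervector.
classes = rng.choice([-1, 1], size=(K, D)).astype(np.int32)
checksum = classes.sum(axis=0)               # CHE: one extra hypervector, the elementwise sum

# Simulate memory errors in a few stored elements of one class hypervector.
corrupt = rng.choice(D, size=25, replace=False)
classes[0, corrupt] *= -1

# Elementwise validation of the checksum property; element positions that fail
# are dropped from consideration in every class hypervector.
valid = classes.sum(axis=0) == checksum
print(f"{(~valid).sum()} corrupted element positions masked out")

def classify(query):
    """Associative-memory query restricted to the validated element positions."""
    sims = classes[:, valid] @ query[valid]  # dot-product similarity on valid elements
    return int(np.argmax(sims))

query = classes[3].astype(np.float64)        # a query matching class 3
print(classify(query))                       # -> 3, despite the corrupted elements
```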
We propose a novel framework that combines state-of-the-art deep learning approaches with pre- and post-processing algorithms for particle detection in complex/heterogeneous backgrounds common in the manufacturing domain. Traditional methods, like size analyzers and those based on dilution, image processing, or deep learning, typically excel with homogeneous backgrounds. Yet, they often fall short in accurately detecting particles against the intricate and varied backgrounds characteristic of heterogeneous particle–substrate (HPS) interfaces in manufacturing. To address this, we've developed a flexible framework designed to detect particles in diverse environments and input types. Our modular framework hinges on model selection and AI-guided particle detection as its core, with preprocessing and postprocessing as integral components, creating a four-step process. This system is versatile, allowing for various preprocessing, AI model selection, and postprocessing strategies. We demonstrate this with an entrainment-based particle delivery method, transferring various particles onto substrates that mimic the HPS interface. By altering particle and substrate properties (e.g., material type, size, roughness, shape) and process parameters (e.g., capillary number) during particle entrainment, we capture images under different ambient lighting conditions, introducing a range of HPS background complexities. In the preprocessing phase, we apply image enhancement and sharpening techniques to improve detection accuracy. Specifically, image enhancement adjusts the dynamic range and histogram, while sharpening increases contrast by combining the high pass filter output with the base image. We introduce an image classifier model (based on the type of heterogeneity), employing Transfer Learning with MobileNet as a Model Selector, to identify the most appropriate AI model (i.e., YOLO model) for analyzing each specific image, thereby enhancing detection accuracy across particle–substrate variations. Following image classification based on heterogeneity, the relevant YOLO model is employed for particle identification, with a distinct YOLO model generated for each heterogeneity type, improving overall classification performance. In the postprocessing phase, domain knowledge is used to minimize false positives. Our analysis indicates that the AI-guided framework maintains consistent precision and recall across various HPS conditions, with the harmonic mean of these metrics comparable to those of individual AI model outcomes. This tool shows potential for advancing in-situ process monitoring across multiple manufacturing operations, including high-density powder-based 3D printing, powder metallurgy, extreme environment coatings, particle categorization, and semiconductor manufacturing.
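A brief sketch of the preprocessing step (histogram adjustment followed by high-pass sharpening) and the modular select-then-detect flow, assuming OpenCV for the image operations; `select_model`, `detectors`, and `postprocess` are placeholder callables standing in for the MobileNet-based model selector, the per-heterogeneity YOLO models, and the domain-knowledge rules, none of which are specified in the abstract.

```python
import cv2

def preprocess(image_bgr, amount=1.5, sigma=3):
    """Step 1: adjust the histogram/dynamic range, then sharpen by adding
    high-pass content back onto the base image (unsharp masking)."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2Lab)
    light, a, b = cv2.split(lab)
    enhanced = cv2.cvtColor(cv2.merge((cv2.equalizeHist(light), a, b)),
                            cv2.COLOR_Lab2BGR)
    blurred = cv2.GaussianBlur(enhanced, (0, 0), sigma)      # low-pass estimate
    # base + amount * (base - low-pass) emphasizes the high-pass detail
    return cv2.addWeighted(enhanced, 1 + amount, blurred, -amount, 0)

def detect_particles(image_bgr, select_model, detectors, postprocess):
    """Steps 2-4: classify the heterogeneity type, run the matching detector,
    then apply domain-knowledge rules to prune false positives."""
    x = preprocess(image_bgr)
    heterogeneity = select_model(x)        # placeholder for the MobileNet-based selector
    boxes = detectors[heterogeneity](x)    # placeholder for the per-type YOLO model
    return postprocess(boxes)              # placeholder for domain-knowledge filtering
```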
Quantum states decohere through interaction with the environment. Quantum error correction can preserve coherence through active feedback wherein quantum information is encoded into a logical state with a high degree of symmetry. Perturbations are detected by measuring the symmetries of the state and corrected by applying gates based on these measurements. To measure the symmetries without perturbing the data, ancillary quantum states are required. Shor error correction uses a separate quantum state for the measurement of each symmetry. Steane error correction maps the perturbations onto a logical ancilla qubit, which is then measured to check several symmetries simultaneously. We experimentally compare Shor and Steane correction of bit flip errors using the Bacon-Shor code implemented in a chain of 23 trapped atomic ions. We find that the Steane method produces fewer errors after a single round of error correction and less disturbance to the data qubits without error correction.
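A classical toy sketch of the bookkeeping difference between the two extraction styles for bit-flip errors on the three-qubit repetition code: Shor-style computes each parity with its own ancilla measurement, while Steane-style copies the error pattern onto an encoded ancilla block that is measured once. This is an illustration only, not the Bacon-Shor code or the trapped-ion experiment described above.

```python
import random

random.seed(3)

def encode(bit):
    """Three-qubit repetition code: one logical bit, protected against single bit flips."""
    return [bit, bit, bit]

def shor_style_syndrome(data):
    """Shor-style extraction: one ancilla measurement per stabilizer (Z1Z2, Z2Z3)."""
    return (data[0] ^ data[1], data[1] ^ data[2])

def steane_style_syndrome(data):
    """Steane-style extraction: copy the error pattern onto an encoded ancilla block
    (transversal CNOT, data as control), measure the whole ancilla once, decode."""
    ancilla = encode(0)
    ancilla = [a ^ d for a, d in zip(ancilla, data)]        # transversal CNOT data -> ancilla
    return (ancilla[0] ^ ancilla[1], ancilla[1] ^ ancilla[2])

def correct(data, syndrome):
    flip = {(1, 0): 0, (1, 1): 1, (0, 1): 2}.get(syndrome)  # which qubit to flip, if any
    if flip is not None:
        data[flip] ^= 1
    return data

logical = 1
data = encode(logical)
data[random.randrange(3)] ^= 1                              # a single random bit-flip error
s_shor, s_steane = shor_style_syndrome(data), steane_style_syndrome(data)
assert s_shor == s_steane                                   # both schemes see the same syndrome
print(correct(data, s_shor), "recovers logical", logical)
```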