skip to main content

Search for: All records

Creators/Authors contains: "Das, Anup"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. The paper develops a methodology for the online built-in self-testing of deep neural network (DNN) accelerators to validate the correct operation with respect to their functional specifications. The DNN of interest is realized in the hardware to perform in-memory computing using non-volatile memory cells as computational units. Assuming a functional fault model, we develop methods to generate pseudorandom and structured test patterns to detect hardware faults. We also develop a test-sequencing strategy that combines these different classes of tests to achieve high fault coverage. The testing methodology is applied to a broad class of DNNs trained to classify images from the MNIST, Fashion-MNIST, and CIFAR-10 datasets. The goal is to expose hardware faults which may lead to the incorrect classification of images. We achieve an average fault coverage of 94% for these different architectures, some of which are large and complex.
    Free, publicly-accessible full text available August 1, 2023
  2. Spiking Neural Networks (SNNs) are an emerging computation model that uses event-driven activation and bio-inspired learning algorithms. SNN-based machine learning programs are typically executed on tile-based neuromorphic hardware platforms, where each tile consists of a computation unit called a crossbar, which maps neurons and synapses of the program. However, synthesizing such programs on an off-the-shelf neuromorphic hardware is challenging. This is because of the inherent resource and latency limitations of the hardware, which impact both model performance, e.g., accuracy, and hardware performance, e.g., throughput. We propose DFSynthesizer, an end-to-end framework for synthesizing SNN-based machine learning programs to neuromorphic hardware. The proposed framework works in four steps. First, it analyzes a machine learning program and generates SNN workload using representative data. Second, it partitions the SNN workload and generates clusters that fit on crossbars of the target neuromorphic hardware. Third, it exploits the rich semantics of the Synchronous Dataflow Graph (SDFG) to represent a clustered SNN program, allowing for performance analysis in terms of key hardware constraints such as number of crossbars, dimension of each crossbar, buffer space on tiles, and tile communication bandwidth. Finally, it uses a novel scheduling algorithm to execute clusters on crossbars of the hardware, guaranteeing hardwaremore »performance. We evaluate DFSynthesizer with 10 commonly used machine learning programs. Our results demonstrate that DFSynthesizer provides a much tighter performance guarantee compared to current mapping approaches.« less
    Free, publicly-accessible full text available May 31, 2023
  3. Free, publicly-accessible full text available July 18, 2023
  4. An emerging use-case of machine learning (ML) is to train a model on a high-performance system and deploy the trained model on energy-constrained embedded systems. Neuromorphic hardware platforms, which operate on principles of the biological brain, can significantly lower the energy overhead of a machine learning inference task, making these platforms an attractive solution for embedded ML systems. We present a design-technology tradeoff analysis to implement such inference tasks on the processing elements (PEs) of a Non-Volatile Memory (NVM)-based neuromorphic hardware. Through detailed circuit-level simulations at scaled process technology nodes, we show the negative impact of technology scaling on the information-processing latency, which impacts the quality-of-service (QoS) of an embedded ML system. At a finer granularity, the latency inside a PE depends on 1) the delay introduced by parasitic components on its current paths, and 2) the varying delay to sense different resistance states of its NVM cells. Based on these two observations, we make the following three contributions. First, on the technology front, we propose an optimization scheme where the NVM resistance state that takes the longest time to sense is set on current paths having the least delay, and vice versa, reducing the average PE latency, which improvesmore »the QoS. Second, on the architecture front, we introduce isolation transistors within each PE to partition it into regions that can be individually power-gated, reducing both latency and energy. Finally, on the system-software front, we propose a mechanism to leverage the proposed technological and architectural enhancements when implementing a machine-learning inference task on neuromorphic PEs of the hardware. Evaluations with a recent neuromorphic hardware architecture show that our proposed design-technology co-optimization approach improves both performance and energy efficiency of machine-learning inference tasks without incurring high cost-per-bit.« less
    Free, publicly-accessible full text available March 21, 2023
  5. The insula plays a fundamental role in a wide range of adaptive human behaviors, but its electrophysiological dynamics are poorly understood. Here, we used human intracranial electroencephalographic recordings to investigate the electrophysiological properties and hierarchical organization of spontaneous neuronal oscillations within the insula. We analyzed the neuronal oscillations of the insula directly and found that rhythms in the theta and beta frequency oscillations are widespread and spontaneously present. These oscillations are largely organized along the anterior–posterior (AP) axis of the insula. Both the left and right insula showed anterior-­to-posterior decreasing gradients for the power of oscillations in the beta frequency band. The left insula also showed a posterior-to-anterior decreasing frequency gradient and an anterior-to-posterior decreasing power gradient in the theta frequency band. In addition to measuring the power of these oscillations, we also examined the phase of these signals across simultaneous recording channels and found that the insula oscillations in the theta and beta bands are traveling waves. The strength of the traveling waves in each frequency was positively correlated with the amplitude of each oscillation. However, the theta and beta traveling waves were uncoupled to each other in terms of phase and amplitude, which suggested that insular traveling wavesmore »in the theta and beta bands operate independently. Our findings provide new insights into the spatiotemporal dynamics and hierarchical organization of neuronal oscillations within the insula, which, given its rich connectivity with widespread cortical regions, indicates that oscillations and traveling waves have an important role in intrainsular and interinsular communications.« less
    Free, publicly-accessible full text available May 26, 2023
  6. Free, publicly-accessible full text available March 14, 2023
  7. Precise monitoring of respiratory rate in premature newborn infants is essential to initiating medical interventions as required. Wired technologies can be invasive and obtrusive to the patients. We propose a deep-learning-enabled wearable monitoring system for premature newborn infants, where respiratory cessation is predicted using signals that are collected wirelessly from a non-invasive wearable Bellypatch put on the infant’s body. We propose a five-stage design pipeline involving data collection and labeling, feature scaling, deep learning model selection with hyperparameter tuning, model training and validation, and model testing and deployment. The model used is a 1-D convolutional neural network (1DCNN) architecture with one convolution layer, one pooling layer, and three fully-connected layers, achieving 97.15% classification accuracy. To address the energy limitations of wearable processing, several quantization techniques are explored, and their performance and energy consumption are analyzed for the respiratory classification task. Results demonstrate a reduction of energy footprints and model storage overhead with a considerable degradation of the classification accuracy, meaning that quantization and other model compression techniques are not the best solution for respiratory classification problem on wearable devices. To improve accuracy while reducing the energy consumption, we propose a novel spiking neural network (SNN)-based respiratory classification solution, which canmore »be implemented on event-driven neuromorphic hardware platforms. To this end, we propose an approach to convert the analog operations of our baseline trained 1DCNN to their spiking equivalent. We perform a design-space exploration using the parameters of the converted SNN to generate inference solutions having different accuracy and energy footprints. We select a solution that achieves an accuracy of 93.33% with 18x lower energy compared to the baseline 1DCNN model. Additionally, the proposed SNN solution achieves similar accuracy as the quantized model with a 4× lower energy.« less
    Free, publicly-accessible full text available March 1, 2023
  8. Free, publicly-accessible full text available February 1, 2023