skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Stochastic Computing Architectures for Lightweight LSTM Neural Networks
For emerging edge and near-sensor systems to perform hard classification tasks locally, they must avoid costly communication with the cloud. This requires the use of compact classifiers such as recurrent neural networks of the long short term memory (LSTM) type, as well as a low-area hardware technology such as stochastic computing (SC). We study the benefits and costs of applying SC to LSTM design. We consider a design space spanned by fully binary (non-stochastic), fully stochastic, and several hybrid (mixed) LSTM architectures, and design and simulate examples of each. Using standard classification benchmarks, we show that area and power can be reduced up to 47% and 86% respectively with little or no impact on classification accuracy. We demonstrate that fully stochastic LSTMs can deliver acceptable accuracy despite accumulated errors. Our results also suggest that ReLU is preferable to tanh as an activation function in stochastic LSTMs  more » « less
Award ID(s):
2006704
PAR ID:
10324354
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2022 25th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The widespread use of distributed energy sources (DERs) raises significant challenges for power system design, planning, and operation, leading to wide adaptation of tools on hosting capacity analysis (HCA). Traditional HCA methods conduct extensive power flow analysis. Due to the computation burden, these time-consuming methods fail to provide online hosting capacity (HC) in large distribution systems. To solve the problem, we first propose a deep learning-based problem formulation for HCA, which conducts offline training and determines HC in real time. The used learning model, long short-term memory (LSTM), implements historical time-series data to capture periodical patterns in distribution systems. However, directly applying LSTMs suffers from low accuracy due to the lack of consideration on spatial information, where location information like feeder topology is critical in nodal HCA. Therefore, we modify the forget gate function to dual forget gates, to capture the spatial correlation within the grid. Such a design turns the LSTM into the Spatial-Temporal LSTM (ST-LSTM). Moreover, as voltage violations are the most vital constraints in HCA, we design a voltage sensitivity gate to increase accuracy further. The results of LSTMs and ST-LSTMs on feeders, such as IEEE 34-, 123-bus feeders, and utility feeders, validate our designs. 
    more » « less
  2. Abstract— Recent advances in near-sensor computing have prompted the need to design low-cost digital filters for edge devices. Stochastic computing (SC), leveraging its probabilistic bit-streams, has emerged as a compelling alternative to traditional deterministic computing for filter design. This paper examines error tolerance, area and power efficiency, and accuracy loss in SC-based digital filters. Specifically, we investigate the impact of various stochastic number generators and increased filter complexity on both FIR and IIR filters. Our results indicate that in an error-free environment, SC exhibits a 49% area advantage and a 64% power efficiency improvement, albeit with a slight loss of accuracy, compared to traditional binary implementations. Furthermore, when the input bitstreams are subject to a 2% bit-flip error rate, SC FIR and SC IIR filters have a much smaller performance degradation (1.3X and 1.9X, respectively) than comparable binary filters. In summary, this work provides useful insights into the advantages of stochastic computing in digital filter design, showcasing its robust error resilience, significant area and power efficiency gains, and trade-offs in accuracy compared to traditional binary approaches. 
    more » « less
  3. Designing low-cost filterbanks is important due to severe resource limitations imposed by hearing aid size. Here, we develop a novel FIR filterbank employing stochastic computing (SC). SC-based filters use (pseudo)-random bitstreams to efficiently perform the core filtering operation. We demonstrate that SC is well-suited to low-cost filterbank design and compare our SC filterbank to a conventional sequential binary (SB) design. We show that the SC design achieves the same accuracy and latency as the SB one, with an exceptionally large 70% reduction in chip area. The power consumption of our proposed SC filterbank is 38-96% that of the SB design. 
    more » « less
  4. Stochastic computing (SC) is a low-cost computational paradigm that has promising applications in digital filter design, image processing, and neural networks. Fundamental to these applications is the weighted addition operation, which is most often implemented by a multiplexer (mux) tree. Mux-based adders have very low area but typically require long bitstreams to reach practical accuracy thresholds when the number of summands is large. In this work, we first identify the main contributors to mux adder error. We then demonstrate with analysis and experiment that two new techniques, precise sampling and full correlation, can target and mitigate these error sources. Implementing these techniques in hardware leads to the design of CeMux (Correlation-enhanced Multiplexer), a stochastic mux adder that is significantly more accurate and uses much less area than traditional weighted adders. We compare CeMux to other SC and hybrid designs for an electrocardiogram filtering case study that employs a large digital filter. One major result is that CeMux is shown to be accurate even for large input sizes. CeMux's higher accuracy leads to a latency reduction of 4× to 16× over other designs. Furthermore, CeMux uses about 35% less area than existing designs, and we demonstrate that a small amount of accuracy can be traded for a further 50% reduction in area. Finally, we compare CeMux to a conventional binary design and we show that CeMux can achieve a 50% to 73% area reduction for similar power and latency as the conventional design but at a slightly higher level of error. 
    more » « less
  5. We propose a new Scaled Population (SP) based arithmetic computation approach that achieves considerable improvements over existing stochastic computing (SC) techniques. First, SP arithmetic introduces scaling operations that significantly reduce the numerical errors as compared to SC. Experiments show accuracy improvements of a single multiplication and addition operation by 6.3X and 4X, respectively. Secondly, SP arithmetic erases the inherent serialization associated with stochastic computing, thereby significantly improves the computational delays. We design each of the operations of SP arithmetic to take O(1) gate delays, and eliminate the need of serially iterating over the bits of the population vector. Our SP approach improves the area, delay and power compared with conventional stochastic computing on an FPGA-based implementation. We also apply our SP scheme on a handwritten digit recognition application (MNIST), improving the recognition accuracy by 32.79% compared to SC. 
    more » « less