Stochastic computing (SC) is a digital design paradigm that forgoes the conventional binary encoding in favor of pseudo-random bitstreams. Stochastic circuits operate on the probability values of bitstreams, and often achieve low power, low area, and fault-tolerant computation. Most SC designs rely on the input bitstreams being independent or uncorrelated to obtain the best results. However, circuits have also been proposed that exploit deliberately correlated bitstreams to improve area or accuracy. In such cases, different sub-circuits may have different correlation requirements. A major barrier to multi-layer or hierarchical stochastic circuit design has been understanding how correlation propagates through a circuit while meeting the correlation requirements of all its sub-circuits. In this paper, we introduce correlation matrices and extensions to probability transfer matrix (PTM) algebra to analyze complex correlation behavior, thereby reducing the need for computationally intensive bit-wise simulation. We apply our new correlation analysis to two multi-layer SC image processing and neural network circuits and show that it helps designers to systematically reduce correlation error.
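The dependence of a stochastic circuit's function on input correlation can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes comparator-style bitstream generation, and the helper names (generate_bitstream, bitstream_value, and_gate) are hypothetical. It shows an AND gate approximating a product when its inputs come from independent random sources, and computing a minimum when both inputs share one source.

```python
import random

def generate_bitstream(p, random_values):
    """Emit 1 whenever the random value falls below the target probability p."""
    return [1 if r < p else 0 for r in random_values]

def bitstream_value(bits):
    """The value of a bitstream is its fraction of 1s."""
    return sum(bits) / len(bits)

def and_gate(x, y):
    """Bit-wise AND of two equal-length bitstreams."""
    return [a & b for a, b in zip(x, y)]

rng = random.Random(0)   # fixed seed (arbitrary choice) for repeatability
n = 4096                 # bitstream length (arbitrary choice)
source_a = [rng.random() for _ in range(n)]
source_b = [rng.random() for _ in range(n)]

px, py = 0.5, 0.75
x = generate_bitstream(px, source_a)
y_independent = generate_bitstream(py, source_b)  # separate random source
y_correlated = generate_bitstream(py, source_a)   # same random source as x

# Independent inputs: AND approximates the product px * py.
product_estimate = bitstream_value(and_gate(x, y_independent))
# Shared source: every 1 in x coincides with a 1 in y, so AND returns x,
# whose value approximates min(px, py).
minimum_estimate = bitstream_value(and_gate(x, y_correlated))

print(product_estimate, minimum_estimate)
```

The same gate therefore implements two different functions, which is why each sub-circuit in a multi-layer design carries its own correlation requirement.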
Multiplexer-majority chains: managing correlation and cost in stochastic number generation
ABSTRACT - High-cost stochastic number generators (SNGs) are the main source of stochastic numbers (SNs) in stochastic computing. Interacting SNs must usually be uncorrelated for satisfactory results, but deliberate correlation can sometimes dramatically reduce area and/or improve accuracy. However, very little is known about the correlation behavior of SNGs. In this work, a core SNG component, its probability conversion circuit (PCC), is analyzed to reveal important tradeoffs between area, correlation, and accuracy. We show that PCCs of the weighted binary generator (WBG) type cannot consistently generate correlated bitstreams, which leads to inaccurate outputs for some designs. In contrast, comparator-based PCCs (CMPs) can generate highly correlated bitstreams but are about twice as large as WBGs. To overcome these area-correlation limitations, a novel class of PCCs called multiplexer majority chains (MMCs) is introduced. Some MMCs are area efficient like WBGs but can generate highly correlated SNs like CMPs and can reduce the area of a filtering circuit by 30% while sacrificing only 7% accuracy. The large influence of PCC design on circuit area and accuracy is explored and suggestions are made for selecting the best PCC based on a target system’s correlation requirements.
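The contrast between the two PCC types can be sketched in a few lines. The code below is an illustration under stated assumptions, not the paper's design: it uses a 4-bit width, a textbook form of the weighted binary generator (the first 1 bit of the random number, scanning from the most significant bit, selects the matching bit of the binary input), and hypothetical function names. It enumerates every 4-bit random value, so the results are exact rather than sampled.

```python
K = 4  # bit width (arbitrary choice for this sketch)

def cmp_pcc(b, r):
    """Comparator PCC: output 1 when the random number r is below the input b."""
    return 1 if r < b else 0

def wbg_pcc(b, r, k=K):
    """WBG PCC (textbook form, assumed): the highest set bit of r picks a bit of b."""
    for i in range(k - 1, -1, -1):   # scan r from MSB to LSB
        if (r >> i) & 1:             # this position fires with probability 2^-(k-i)
            return (b >> i) & 1      # forward the equally weighted bit of b
    return 0                         # r == 0 selects nothing

def ones_count(pcc, b):
    """Number of 1s produced for input b over all 2^K random values."""
    return sum(pcc(b, r) for r in range(2 ** K))

def overlap(pcc, b1, b2):
    """Number of cycles where both outputs are 1 when the PCCs share r."""
    return sum(pcc(b1, r) & pcc(b2, r) for r in range(2 ** K))

# Both PCCs encode b as the value b / 2^K.
print([ones_count(cmp_pcc, b) for b in range(2 ** K)])
print([ones_count(wbg_pcc, b) for b in range(2 ** K)])

# With a shared random source, maximal positive correlation means an
# overlap of min(b1, b2) ones.
print(overlap(cmp_pcc, 8, 4))  # comparator: overlap equals min(8, 4)
print(overlap(wbg_pcc, 8, 4))  # WBG: the two streams never overlap
```

In this model the WBG overlap equals the bitwise AND of the two inputs, so inputs 12 and 4 happen to be maximally correlated while 8 and 4 are not. That is consistent with the abstract's statement that WBG-type PCCs cannot consistently generate correlated bitstreams.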
- Award ID(s):
- 2006704
- PAR ID:
- 10415131
- Date Published:
- Journal Name:
- 17th Symposium on Nanoscale Architectures (NANOARCH)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Stochastic computing (SC) is a low-cost computational paradigm that has promising applications in digital filter design, image processing, and neural networks. Fundamental to these applications is the weighted addition operation, which is most often implemented by a multiplexer (mux) tree. Mux-based adders have very low area but typically require long bitstreams to reach practical accuracy thresholds when the number of summands is large. In this work, we first identify the main contributors to mux adder error. We then demonstrate with analysis and experiment that two new techniques, precise sampling and full correlation, can target and mitigate these error sources. Implementing these techniques in hardware leads to the design of CeMux (Correlation-enhanced Multiplexer), a stochastic mux adder that is significantly more accurate and uses much less area than traditional weighted adders. We compare CeMux to other SC and hybrid designs for an electrocardiogram filtering case study that employs a large digital filter. One major result is that CeMux is shown to be accurate even for large input sizes. CeMux's higher accuracy leads to a latency reduction of 4× to 16× over other designs. Furthermore, CeMux uses about 35% less area than existing designs, and we demonstrate that a small amount of accuracy can be traded for a further 50% reduction in area. Finally, we compare CeMux to a conventional binary design and show that CeMux can achieve a 50% to 73% area reduction at power and latency similar to those of the conventional design, but at a slightly higher level of error.
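For readers unfamiliar with mux-based weighted addition, a minimal behavioral sketch follows. It models only a basic mux adder with random select signals, not CeMux, and it does not implement precise sampling or full correlation. The function name mux_add and all numeric values are hypothetical.

```python
import random

def mux_add(streams, weights, rng):
    """Each cycle, forward one input bit, chosen with probability given by its weight."""
    indices = list(range(len(streams)))
    length = len(streams[0])
    output = []
    for t in range(length):
        selected = rng.choices(indices, weights=weights)[0]  # random select signal
        output.append(streams[selected][t])
    return output

rng = random.Random(1)        # fixed seed (arbitrary choice) for repeatability
n = 8192                      # bitstream length (arbitrary choice)
values = [0.2, 0.6, 0.9]      # input values (arbitrary choice)
weights = [0.5, 0.25, 0.25]   # weights sum to 1

streams = [[1 if rng.random() < v else 0 for _ in range(n)] for v in values]
output = mux_add(streams, weights, rng)

estimate = sum(output) / n
expected = sum(w * v for w, v in zip(weights, values))
print(estimate, expected)
```

Because only one input bit is sampled per cycle, the estimate fluctuates around the weighted sum, which is the long-bitstream requirement the abstract refers to.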
-
Abstract—Stochastic computing is a low-cost non-standard computer architecture that processes pseudo-random bitstreams. Its effectiveness, and that of other probabilistic methods, requires maintaining desired levels of correlation among interacting input bitstreams, for example, SCC = 0 or SCC = +1, where SCC is the stochastic cross-correlation metric. Correlation errors are systematic (bias-causing) errors that cannot be corrected by increasing bitstream length. A typical stochastic design C1 only controls correlation at its primary input lines. This is a fairly straightforward task; however, it limits the scope of SC to “single layer,” usually combinational, designs. In situations where a second processing layer C2 follows C1, the output correlation of C1 must satisfy the input correlation needs of C2. This can be done by inserting a (sequential) correlation control layer S12 between C1 and C2, which incurs high area and delay overhead. S12 transforms intralayer bitstreams Z with unknown or undesired SCC values into numerically equivalent ones Z* with desired correlation. The fundamental problem of designing C1 to produce Z* directly, thereby dispensing with S12, which apparently has not been considered before, is addressed in this paper. We focus on two-layer designs C1C2 requiring SCC = +1 between layers, and present a method called COMAX for (re)designing C1 so that it outputs bitstreams with correlation that is as close as possible to +1. We demonstrate on a representative image processing application that, compared to alternative correlation control techniques, COMAX reduces area by about 50% without reducing output image quality.
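The abstract refers to SCC without stating its formula. The sketch below assumes the definition commonly used in the SC literature: the deviation of the observed overlap from the overlap expected under independence, normalized by the largest deviation possible in that direction. The function name scc and the example bitstreams are hypothetical.

```python
def scc(x, y):
    """Stochastic cross-correlation of two equal-length bitstreams (assumed definition)."""
    n = len(x)
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)  # both bits are 1
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)  # only x is 1
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)  # only y is 1
    d = n - a - b - c                                      # both bits are 0
    numerator = a * d - b * c
    if numerator == 0:
        return 0.0  # overlap matches the independent case (or a stream is constant)
    if numerator > 0:
        # Normalize by the maximum possible overlap, min(px, py).
        denominator = n * min(a + b, a + c) - (a + b) * (a + c)
    else:
        # Normalize by the minimum possible overlap, max(px + py - 1, 0).
        denominator = (a + b) * (a + c) - n * max(a - d, 0)
    return numerator / denominator

x = [1, 1, 0, 0]
print(scc(x, [1, 1, 1, 0]))  # the 1s of x sit inside the 1s of y
print(scc(x, [0, 0, 1, 1]))  # the 1s never coincide
print(scc(x, [1, 0, 1, 0]))  # overlap equals the product of the values
```

Under this definition the three calls correspond to SCC = +1, SCC = -1, and SCC = 0 respectively.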
-
Recent advances in hardware integration and realization of highly efficient Compressive Sensing (CS) approaches have inspired novel circuit and architectural-level approaches. These embrace the challenge of designing more optimal non-uniform CS solutions that consider device-level constraints for IoT applications in which lifetime energy, device area, and manufacturing costs are highly constrained while the sensing environment is rapidly changing. In this manuscript, we develop a novel adaptive hardware-based approach for non-uniform compressive sampling of sparse and time-varying signals. The proposed Adaptive Sampling of Sparse IoT signals via STochastic-oscillators (ASSIST) approach intelligently generates the CS measurement matrix by distributing the sensing energy among coefficients, taking into account signal characteristics such as the sparsity rate and noise level obtained in the previous time step. In our proposed approach, Magnetic Random Access Memory (MRAM)-based stochastic oscillators are utilized to generate the random bitstreams used in the CS measurement matrix. SPICE and MATLAB circuit-algorithm simulation results indicate that ASSIST efficiently achieves the desired non-uniform recovery of the original signals with varying sparsity rates and noise levels.
-
Abstract—Stochastic computing (SC) uses streams of pseudo-random bits to perform low-cost and error-tolerant numerical processing for applications like neural networks and digital filtering. A key operation in these domains is the summation of many hundreds of bit-streams, but existing SC adders are inflexible and unpredictable. Basic mux adders have low area but poor accuracy while other adders like accumulative parallel counters (APCs) have good accuracy but high area. This work introduces parallel sampling adders (PSAs), a novel weighted adder family that offers a favorable area-accuracy trade-off and provides great flexibility to large-scale SC adder design. Our experiments show that PSAs can sometimes achieve the same high accuracy as APCs, but at half the area cost. We also examine the behavior of large-scale SC adders in depth and uncover some surprising results. First, APC accuracy is shown to be sensitive to input correlation despite the common belief that APCs are correlation insensitive. Then, we show that mux-based adders are sometimes more accurate than APCs, which contradicts most prior studies. Explanations for these anomalies are given and a decorrelation scheme is proposed to improve APC accuracy by 4x for a digital filtering application.
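As a point of reference for the APC side of this comparison, here is a minimal behavioral sketch of an unweighted accumulative parallel counter. It is an idealized model with a hypothetical function name, not the paper's PSA design, and it does not reproduce the correlation sensitivity the abstract reports.

```python
def apc_sum(streams):
    """Count the 1s across all inputs every cycle and accumulate them in binary."""
    total = 0
    for bits in zip(*streams):   # one tuple of input bits per clock cycle
        total += sum(bits)       # parallel counter output for this cycle
    return total / len(streams[0])  # accumulated count, scaled by stream length

streams = [
    [1, 0, 1, 1],  # value 0.75
    [0, 0, 1, 0],  # value 0.25
    [1, 1, 1, 1],  # value 1.00
]
print(apc_sum(streams))
```

Because every input bit is counted rather than sampled, the result in this idealized model equals the sum of the input bitstream values, at the cost of a counter and accumulator that grow with the number of inputs.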

