skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Majority-Based Spin-CMOS Primitives for Approximate Computing
Promising for digital signal processing applications, approximate computing has been extensively considered to tradeoff limited accuracy for improvements in other circuit metrics such as area, power, and performance. In this paper, approximate arithmetic circuits are proposed by using emerging nanoscale spintronic devices. Leveraging the intrinsic current-mode thresholding operation of spintronic devices, we initially present a hybrid spin-CMOS majority gate design based on a composite spintronic device structure consisting of a magnetic domain wall motion stripe and a magnetic tunnel junction. We further propose a compact and energy-efficient accuracy-configurable adder design based on the majority gate. Unlike most previous approximate circuit designs that hardwire a constant degree of approximation, this design is adaptive to the inherent resilience in various applications to different degrees of accuracy. Subsequently, we propose two new approximate compressors for utilization in fast multiplier designs. The device-circuit SPICE simulation shows 34.58% and 66% improvement in power consumption, respectively, for the accurate and approximate modes of the accuracy-configurable adder, compared to the recently reported domain wall motion-based full adder design. In addition, the proposed accuracy-configurable adder and approximate compressors can be efficiently utilized in the discrete cosine transform (DCT) as a widely-used digital image processing algorithm. The results indicate that the DCT and inverse DCT (IDCT) using the approximate multiplier achieve ~2x energy saving and 3x speed-up compared to an exactly-designed circuit, while achieving comparable quality in its output result.  more » « less
Award ID(s):
1740126
PAR ID:
10094214
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
IEEE Transactions on Nanotechnology
Volume:
17
Issue:
4
ISSN:
1536-125X
Page Range / eLocation ID:
795 to 806
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The domain wall-magnetic tunnel junction (DW-MTJ) is a spintronic device that enables efficient logic circuit design because of its low energy consumption, small size, and non-volatility. Furthermore, the DW-MTJ is one of the few spintronic devices for which a direct cascading mechanism is experimentally demonstrated without any extra buffers; this enables potential design and fabrication of a large-scale DW-MTJ logic system. However, DW-MTJ logic relies on the conversion between electrical signals and magnetic states which is sensitive to process imperfection. Therefore, it is important to analyze the robustness of such DW-MTJ devices to anticipate the system reliability before fabrication. Here we propose a new DW-MTJ model that integrates the impacts of process variation to enable the analysis and optimization of DW-MTJ logic. This will allow circuit and device design that enhances the robustness of DW-MTJ logic and advances the development of energy-efficient spintronic computing systems. 
    more » « less
  2. This paper presents a configurable binary design library including fundamental arithmetic circuits like full-adder, full-subtractor, binary multiplier, shifter, and more. The Chisel Hardware Construction Language (HCL) is employed to build the parameterizable designs with different precision including half-word, word, double-word, and quad-word. Chisel HCL is an open-source embedded domain-specific language that inherits the object-oriented and functional programming aspects of Scala for constructing hardware. Experimental results show the same accuracy achieved by our proposed work compared with the Verilog HDL implementations. The hardware cost in terms of slice count, power consumption, and the maximum clock frequency is further estimated. Compared with traditional design intellectual properties (IPs) provided by IP vendors, our proposed work is configurable and expandable to the other arithmetic implementations and projects. 
    more » « less
  3. Spin-orbit torque (SOT) driven domain wall motion has attracted significant attention as the basis for a variety of spintronic devices due to its potential use as a high speed, low power means to manipulate the magnetic state of an object. While most previous attention has focused on ultrathin films wherein the material thickness is significantly less than the magnetic exchange length, recent reports have suggested unique dynamics may be achieved in intermediate and high thickness films. We used micromagnetic modelling to explore the role of the vertically non-uniform spin textures associated with the domain wall in nanowires of varying thickness on SOT driven domain wall motion. We found large velocity asymmetries between Bloch chiralities near the current density required for reversal of the Bloch component of the magnetization and linked these asymmetries to a gradual reorientation of the domain wall structure which drives a non-negligible, chiral Néel component of the domain wall. We further explored the influence of saturation magnetization, film thickness, the Dzyaloshinskii-Moriya interaction, and in-plane fields on domain wall dynamics. These results provide a framework for the development of SOT based devices based on domain wall motion in nanowires beyond the ultrathin film limit. 
    more » « less
  4. Vector-matrix multiplication (VMM) is a core operation in many signal and data processing algorithms. Previous work showed that analog multipliers based on nonvolatile memories have superior energy efficiency as compared to digital counterparts at low-to-medium computing precision. In this paper, we propose extremely energy efficient analog mode VMM circuit with digital input/output interface and configurable precision. Similar to some previous work, the computation is performed by gate-coupled circuit utilizing embedded floating gate (FG) memories. The main novelty of our approach is an ultra-low power sensing circuitry, which is designed based on translinear Gilbert cell in topological combination with a floating resistor and a low-gain amplifier. Additionally, the digital-to-analog input conversion is merged with VMM, while current-mode algorithmic analog-to-digital circuit is employed at the circuit backend. Such implementations of conversion and sensing allow for circuit operation entirely in a current domain, resulting in high performance and energy efficiency. For example, post-layout simulation results for 400×400 5-bit VMM circuit designed in 55 nm process with embedded NOR flash memory, show up to 400 MHz operation, 1.68 POps/J energy efficiency, and 39.45 TOps/mm2 computing throughput. Moreover, the circuit is robust against process-voltage-temperature variations, in part due to inclusion of additional FG cells that are utilized for offset compensation. 
    more » « less
  5. From signal processing to emerging deep neural networks, a range of applications exhibit intrinsic error resilience. For such applications, approximate computing opens up new possibilities for energy-efficient computing by producing slightly inaccurate results using greatly simplified hardware. Adopting this approach, a variety of basic arithmetic units, such as adders and multipliers, have been effectively redesigned to generate approximate results for many error-resilient applications.In this work, we propose SECO, an approximate exponential function unit (EFU). Exponentiation is a key operation in many signal processing applications and more importantly in spiking neuron models, but its energy-efficient implementation has been inadequately explored. We also introduce a cross-layer design method for SECO to optimize the energy-accuracy trade-off. At the algorithm level, SECO offers runtime scaling between energy efficiency and accuracy based on approximate Taylor expansion, where the error is minimized by optimizing parameters using discrete gradient descent at design time. At the circuit level, our error analysis method efficiently explores the design space to select the energy-accuracy-optimal approximate multiplier at design time. In tandem, the cross-layer design and runtime optimization method are able to generate energy-efficient and accurate approximate EFU designs that are up to 99.7% accurate at a power consumption of 3.73 pJ per exponential operation. SECO is also evaluated on the adaptive exponential integrate-and-fire neuron model, yielding only 0.002% timing error and 0.067% value error compared to the precise neuron model. 
    more » « less