Title: Cryogenic Memory Architecture Integrating Spin Hall Effect based Magnetic Memory and Superconductive Cryotron Devices

One of the most challenging obstacles to realizing exascale computing is minimizing the energy consumption of L2 cache, main memory, and interconnects to that memory. For promising cryogenic computing schemes utilizing Josephson junction superconducting logic, this obstacle is exacerbated by the cryogenic system requirements that expose the technology’s lack of high-density, high-speed and power-efficient memory. Here we demonstrate an array of cryogenic memory cells consisting of a non-volatile three-terminal magnetic tunnel junction element driven by the spin Hall effect, combined with a superconducting heater-cryotron bit-select element. The write energy of these memory elements is roughly 8 pJ with a bit-select element, designed to achieve a minimum overhead power consumption of about 30%. Individual magnetic memory cells measured at 4 K show reliable switching with write error rates below 10^-6, and a 4 × 4 array can be fully addressed with bit select error rates of 10^-6. This demonstration is a first step towards a full cryogenic memory architecture targeting energy and performance specifications appropriate for applications in superconducting high performance and quantum computing control systems, which require significant memory resources operating at 4 K.
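To give a feel for what a per-operation error rate of 10^-6 means at array scale, the following is a minimal sketch that computes the chance of at least one error when every cell of a 4 × 4 array is addressed once. It rests on two assumptions that are not stated in the abstract: errors are treated as statistically independent, and the quoted figure is used as the per-operation error probability. The function name `array_error_probability` is hypothetical.

```python
import math


def array_error_probability(per_op_error: float, n_ops: int) -> float:
    """Probability that at least one of n_ops independent operations fails.

    Assumption (not from the source): errors are independent, so the
    result is 1 - (1 - p)^n.
    """
    if not 0.0 <= per_op_error < 1.0:
        raise ValueError("per_op_error must satisfy 0 <= p < 1")
    # log1p/expm1 keep precision when p is very small.
    return -math.expm1(n_ops * math.log1p(-per_op_error))


# Assumed inputs: 10^-6 per operation, one operation per cell of a 4 x 4 array.
p = 1e-6
n_cells = 4 * 4
prob = array_error_probability(p, n_cells)
print(f"P(at least one error over {n_cells} operations) = {prob:.3e}")
```

Under these assumptions the result is very close to 16 × 10^-6, since for small error rates the array-level probability scales almost linearly with the number of operations.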

Publication Date:
Journal Name:
Scientific Reports
Nature Publishing Group
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper we propose a Highly Flexible In-Memory (HieIM) computing platform using STT MRAM, which can be leveraged to implement Boolean logic functions without sacrificing memory functionality. It can pre-process data within memory, reducing the power-hungry long-distance communication between memory and processing units found in von Neumann computing systems. HieIM can implement all the Boolean logic functions (AND/NAND, OR/NOR, XOR/XNOR) between any two cells in the same memory array, thus overcoming the 'operand locality' problem in contemporary in-memory computing platform designs. To investigate the performance of HieIM, we test in-memory bulk bit-wise Boolean logic operations using different vector datasets and observe ~8x energy savings and ~5x speedup compared to a recent DRAM-based in-memory computing platform. We further implement an in-memory data encryption engine design based on HieIM as another case study. With the AES algorithm, it shows 51.5% and 68.9% lower energy consumption compared to CMOS-ASIC and CMOL based implementations, respectively.
  2. The emerging resistive random access memory (ReRAM) technology has been deemed one of the most promising alternatives to DRAM in main memories, due to its better scalability, zero cell leakage and short read latency. The cross-point (CP) array enables ReRAM to obtain the theoretical minimum 4F^2 cell size by placing a cell at the cross-point of a word-line and a bit-line. However, ReRAM CP arrays suffer from large sneak current resulting in significant voltage drop that greatly prolongs the array RESET latency. Although prior works reduce the voltage drop in CP arrays, they either substantially increase the array peripheral overhead or cannot work well with wear leveling schemes. In this paper, we propose two array micro-architecture level techniques, dynamic RESET voltage regulation (DRVR) and partition RESET (PR), to mitigate voltage drop on both bit-lines and word-lines in ReRAM CP arrays. DRVR dynamically provides higher RESET voltage to the cells that are far from the write driver and thus encounter larger voltage drop on a bit-line, so that all cells on a bit-line share approximately the same latency during RESETs. PR decides online how many and which cells to reset, partitioning the CP array into multiple equivalent circuits with smaller word-line resistance and voltage drop. Because DRVR and PR greatly reduce the array RESET latency, the ReRAM-based main memory lifetime under the worst case non-stop write traffic significantly decreases. To increase the CP array endurance, we further upgrade DRVR by providing lower RESET voltage to the cells suffering from less voltage drop on a word-line. Our experimental results show that, compared to the combination of prior voltage drop reduction techniques, our DRVR and PR improve the system performance by 11.7% and decrease the energy consumption by 46% on average, while still maintaining >10-year main memory system lifetime.
  3. Elemental type-II superconducting niobium is the material of choice for superconducting radiofrequency cavities used in modern particle accelerators, light sources, detectors, sensors, and quantum computing architecture. An essential challenge to increasing energy efficiency in rf applications is the power dissipation due to residual magnetic field that is trapped during the cool down process due to incomplete magnetic field expulsion. New SRF cavity processing recipes that use surface doping techniques have significantly increased their cryogenic efficiency. However, the performance of SRF Nb accelerators still shows vulnerability to a trapped magnetic field. In this manuscript, we report the observation of a direct link between flux trapping and incomplete flux expulsion with spatial variations in microstructure within the niobium. Fine-grain recrystallized microstructure with an average grain size of 10–50 µm leads to flux trapping even with a lack of dislocation structures in grain interiors. Larger grain sizes beyond 100–400 µm do not lead to preferential flux trapping, as observed directly by magneto-optical imaging. While local magnetic flux variations imaged by magneto-optics provide clarity on a microstructure level, bulk variations are also indicated by variations in pinning force curves with sequential heat treatment studies. The key results indicate that complete control of the niobium microstructure will help produce higher performance superconducting resonators with reduced rf losses related to the magnetic flux trapping.
  4. In this paper, we propose a novel Spin-Transfer Torque Magnetic Random-Access Memory (STT-MRAM) array design that can simultaneously work as non-volatile memory and implement reconfigurable in-memory logic operations without adding logic circuits to the memory chip. The computed output can be read out like a typical MRAM bit-cell through the modified peripheral circuit. Such intrinsic in-memory computation can be used to process data locally and transfer the “cooked” data to the primary processing unit (i.e. CPU or GPU) for complex computation with high precision requirements. It greatly reduces power-hungry, long-distance data communication and enables extreme parallelism within memory. In this work, we further propose an in-memory edge extraction algorithm as a case study to demonstrate the efficiency of the in-memory preprocessing methodology. The simulation results show that our edge extraction method reduces data communication by as much as 8x for grayscale images, thus greatly reducing system energy consumption. Meanwhile, the F-measure result shows only ∼10% degradation compared to conventional edge detection operators such as Prewitt, Sobel and Roberts.
  5. We offer a perspective on the prospects of ultrafast spintronics and opto-magnetism as a pathway to high-performance, energy-efficient, and non-volatile embedded memory in digital integrated circuit applications. Conventional spintronic devices, such as spin-transfer-torque magnetic-resistive random-access memory (STT-MRAM) and spin–orbit torque MRAM, are promising due to their non-volatility, energy-efficiency, and high endurance. STT-MRAMs are now entering into the commercial market; however, they are limited in write speed to the nanosecond timescale. Improvement in the write speed of spintronic devices can significantly increase their usefulness as viable alternatives to the existing CMOS-based devices. In this article, we discuss recent studies that advance the field of ultrafast spintronics and opto-magnetism. An optimized ferromagnet–ferrimagnet exchange-coupled magnetic stack, which can serve as the free layer of a magnetic tunnel junction (MTJ), can be optically switched in as fast as ∼3 ps. Integration of ultrafast magnetic switching of a similar stack into an MTJ device has enabled electrical readout of the switched state using a relatively larger tunneling magnetoresistance ratio. Purely electronic ultrafast spin–orbit torque induced switching of a ferromagnet has been demonstrated using ∼6 ps long charge current pulses. We conclude our Perspective by discussing some of the challenges that remain to be addressed to accelerate ultrafast spintronics technologies toward practical implementation in high-performance digital information processing systems.