Title: Parallel programming of an ionic floating-gate memory array for scalable neuromorphic computing
Neuromorphic computers could overcome efficiency bottlenecks inherent to conventional computing through parallel programming and readout of artificial neural network weights in a crossbar memory array. However, selective and linear weight updates and <10-nanoampere read currents are required for learning that surpasses conventional computing efficiency. We introduce an ionic floating-gate memory array based on a polymer redox transistor connected to a conductive-bridge memory (CBM). Selective and linear programming of a redox transistor array is executed in parallel by overcoming the bridging threshold voltage of the CBMs. Synaptic weight readout with currents <10 nanoamperes is achieved by diluting the conductive polymer with an insulator to decrease the conductance. The redox transistors endure >1 billion write-read operations and support >1-megahertz write-read frequencies.
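As a rough illustration of the parallel write and readout described in this abstract, the sketch below models an idealized crossbar in plain Python. It is not the authors' circuit or device model: the function names, the simple "row voltage minus column voltage" cell bias, the fixed update step `dg`, and all numeric values are assumptions chosen for illustration. Voltages are integers in millivolts and conductances are integers in nanosiemens, so currents come out in picoamperes.

```python
# Idealized crossbar sketch. All names and values are illustrative assumptions,
# not parameters taken from the paper.

def read_currents(G, v_rows):
    """Column currents I_j = sum_i V_i * G[i][j] (Ohm's law summed per column)."""
    n_rows, n_cols = len(G), len(G[0])
    return [sum(v_rows[i] * G[i][j] for i in range(n_rows)) for j in range(n_cols)]


def parallel_write(G, v_rows, v_cols, v_th, dg):
    """Apply every row and column voltage at once and return the updated array.

    Cell (i, j) is biased at v_rows[i] - v_cols[j]. Only cells whose bias
    magnitude exceeds the selector threshold v_th are programmed, each by one
    fixed step dg (a linear update). All other cells keep their state.
    """
    updated = [row[:] for row in G]  # copy so the input array is left unchanged
    for i, v_row in enumerate(v_rows):
        for j, v_col in enumerate(v_cols):
            bias = v_row - v_col
            if abs(bias) > v_th:
                updated[i][j] += dg if bias > 0 else -dg
    return updated


# Usage: a 2 x 2 array of 10 nS cells with a 400 mV selector threshold.
G = [[10, 10], [10, 10]]
G_new = parallel_write(G, v_rows=[300, 0], v_cols=[-300, 0], v_th=400, dg=1)
print(G_new)                             # [[11, 10], [10, 10]]
print(read_currents(G_new, [100, 100]))  # [2100, 2000] (pA at a 100 mV read)
```

Only cell (0, 0) sees a 600 mV bias and crosses the threshold; the half-selected cells at 300 mV are left untouched, which is the selectivity property the abstract refers to.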
Award ID(s):
1739795
PAR ID:
10108135
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Science
Volume:
364
Issue:
6440
ISSN:
0036-8075
Page Range / eLocation ID:
570 to 574
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Devices with tunable resistance are highly sought after for neuromorphic computing. Conventional resistive memories, however, suffer from nonlinear and asymmetric resistance tuning and excessive write noise, degrading artificial neural network (ANN) accelerator performance. Emerging electrochemical random-access memories (ECRAMs) display write linearity, which enables substantially faster ANN training by array programming in parallel. However, state-of-the-art ECRAMs have not yet demonstrated stable and efficient operation at temperatures required for packaged electronic devices (~90°C). Here, we show that (semi)conducting polymers combined with ion gel electrolyte films enable solid-state ECRAMs with stable and nearly temperature-independent operation up to 90°C. These ECRAMs show linear resistance tuning over a >2× dynamic range, 20-nanosecond switching, submicrosecond write-read cycling, low noise, and low-voltage (±1 volt) and low-energy (~80 femtojoules per write) operation combined with excellent endurance (>10^9 write-read operations at 90°C). Demonstration of these high-performance ECRAMs is a fundamental step toward their implementation in hardware ANNs.
  2. Despite the advantages of superconductor electronics (SCE), the realization of SCE logic faces a significant challenge due to the absence of dense and scalable nonvolatile memory. While various nonvolatile memory technologies, including nondestructive readout, vortex transitional memory, and magnetic memory, have been explored, designing a dense crossbar array and achieving a superconductor random-access memory remain challenging. This work introduces a novel, nonvolatile, high-density, and scalable vortex-based memory design for SCE logic called bistable vortex memory. Our proposed design addresses scaling issues with an estimated area of 10 × 10 μm² while offering zero static power and a dynamic energy consumption of 12 aJ for single-bit read and write operations. The current summation capability enables analog operations for in-memory or near-memory computational tasks. We demonstrate the efficacy of our approach with a 32 × 32 superconductor memory array operating at 20 GHz. Additionally, we showcase the accumulation property of the memory through analog simulations conducted on an 8 × 8 superconductor crossbar array.
  3. Phase Change Memory (PCM) is an attractive candidate for main memory, as it offers non-volatility and zero leakage power while providing higher cell densities, longer data retention time, and higher capacity scaling compared to DRAM. In PCM, data is stored in the crystalline or amorphous state of the phase change material. The typical electrically controlled PCM (EPCM), however, suffers from longer write latency and higher write energy compared to DRAM and limited multi-level cell (MLC) capacities. These challenges limit the performance of data-intensive applications running on computing systems with EPCMs. Recently, researchers demonstrated optically controlled PCM (OPCM) cells with support for 5 bits/cell, in contrast to 2 bits/cell in EPCM. These OPCM cells can be accessed directly with optical signals that are multiplexed in high-bandwidth-density silicon-photonic links. The higher MLC capacity in OPCM and the direct cell access using optical signals enable an increased read/write throughput and lower energy per access than EPCM. However, due to the direct cell access using optical signals, OPCM systems cannot be designed using conventional memory architecture. We need a complete redesign of the memory architecture that is tailored to the properties of OPCM technology. This article presents the design of a unified network and main memory system called COSMOS that combines OPCM and silicon-photonic links to achieve high memory throughput. COSMOS is composed of a hierarchical multi-banked OPCM array with novel read and write access protocols. COSMOS uses an Electrical-Optical-Electrical (E-O-E) control unit to map standard DRAM read/write commands (sent in the electrical domain) from the memory controller onto optical signals that access the OPCM cells. Our evaluation of a 2.5D-integrated system containing a processor and COSMOS demonstrates a 2.14× average speedup across graph and HPC workloads compared to an EPCM system. COSMOS consumes 3.8× lower read energy-per-bit and 5.97× lower write energy-per-bit compared to EPCM. COSMOS is the first non-volatile memory that provides performance and energy consumption comparable to DDR5 in addition to increased bit density, higher area efficiency, and improved scalability.
  4. Motivated by the significantly higher cost of writing than reading in emerging memory technologies, we consider parallel algorithm design under such asymmetric read-write costs, with the goal of reducing the number of writes while preserving work-efficiency and low span. We present a nested-parallel model of computation that combines (i) small per-task stack-allocated memories with symmetric read-write costs and (ii) an unbounded heap-allocated shared memory with asymmetric read-write costs, and show how the costs in the model map efficiently onto a more concrete machine model under a work-stealing scheduler. We use the new model to design reduced write, work-efficient, low span parallel algorithms for a number of fundamental problems such as reduce, list contraction, tree contraction, breadth-first search, ordered filter, and planar convex hull. For the latter two problems, our algorithms are output-sensitive in that the work and number of writes decrease with the output size. We also present a reduced write, low span minimum spanning tree algorithm that is nearly work-efficient (off by the inverse Ackermann function). Our algorithms reveal several interesting techniques for significantly reducing shared memory writes in parallel algorithms without asymptotically increasing the number of shared memory reads. 
  5. With the rapid development of Deep Neural Networks (DNNs), numerous Process-In-Memory (PIM) designs have emerged to accelerate DNN models with exceptional throughput and energy efficiency. PIM accelerators based on Non-Volatile Memory (NVM) or volatile memory offer distinct advantages for computational efficiency and performance. NVM-based PIM accelerators, although successful in DNN inference, face limitations in on-device learning due to high write energy, latency, and instability. Conversely, fast volatile memories, like SRAM, offer rapid read/write operations for DNN training, but suffer from significant leakage currents and large memory footprints. In this paper, for the first time, we present a fully digital sparse processing in hybrid NVM-SRAM design that synergistically combines the strengths of NVM and SRAM and is tailored for on-device continual learning. Our NVM- and SRAM-based PIM circuit macros support both storage and processing of the N:M structured sparsity pattern, significantly improving storage and computing efficiency. Extensive experiments demonstrate that our hybrid system effectively reduces area and power consumption while maintaining high accuracy, offering a scalable and versatile solution for on-device continual learning.
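The N:M structured sparsity pattern mentioned in item 5 keeps at most N nonzero weights in every consecutive group of M. The sketch below shows one common way to build and use such a pattern in plain Python: magnitude-based selection, then storage as kept values plus in-group indices. The abstract does not describe the encoding or selection rule used in that work, so the function names, the magnitude criterion, and the storage layout here are all assumptions.

```python
# Generic N:M sparsity sketch. Function names, the magnitude-based selection,
# and the (values, indices) layout are illustrative assumptions.

def compress_n_m(weights, n, m):
    """Keep the n largest-magnitude weights in each consecutive group of m.

    Returns (values, indices): for each group, the kept weights and their
    positions inside the group, in ascending position order.
    """
    if len(weights) % m != 0:
        raise ValueError("len(weights) must be a multiple of m")
    values, indices = [], []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        by_magnitude = sorted(range(m), key=lambda k: abs(group[k]), reverse=True)
        keep = sorted(by_magnitude[:n])
        values.append([group[k] for k in keep])
        indices.append(keep)
    return values, indices


def sparse_dot(values, indices, x, m):
    """Dot product that touches only the kept weights."""
    total = 0
    for g, (vals, idxs) in enumerate(zip(values, indices)):
        for v, k in zip(vals, idxs):
            total += v * x[g * m + k]
    return total


# Usage: a 2:4 pattern over eight weights.
w = [3, -1, 0, 4, -2, 5, 1, 0]
vals, idxs = compress_n_m(w, n=2, m=4)
print(vals)   # [[3, 4], [-2, 5]]
print(idxs)   # [[0, 3], [0, 1]]
print(sparse_dot(vals, idxs, [1, 2, 3, 4, 5, 6, 7, 8], m=4))  # 39
```

With a 2:4 pattern, only half of the weights are stored and multiplied, which is the kind of storage and compute saving the abstract attributes to N:M sparsity.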