Title: Photonic (computational) memories: tunable nanophotonics for data storage and computing
Abstract: The exponential growth of information stored in data centers and of the computational power required for data-intensive applications, such as deep learning and AI, calls for new strategies to improve or move beyond the traditional von Neumann architecture. Recent achievements in information storage and computation in the optical domain, enabling energy-efficient, fast, and high-bandwidth data processing, show great potential for photonics to overcome the von Neumann bottleneck and reduce the energy lost to Joule heating. Optically readable memories are fundamental in this process, and while light-based storage has traditionally (and commercially) employed free-space optics, recent developments in photonic integrated circuits (PICs) and optical nano-materials have opened the door to new on-chip opportunities. Photonic memories have yet to rival their electronic digital counterparts in storage density; however, their inherent analog nature and ultrahigh bandwidth make them ideal for unconventional computing strategies. Here, we review emerging nanophotonic devices that possess memory capabilities by elaborating on their tuning mechanisms and evaluating them in terms of scalability and device performance. Moreover, we discuss the progress on large-scale architectures for photonic memory arrays and optical computing, primarily based on memory performance.
Award ID(s):
2105972 2028624 2003325
NSF-PAR ID:
10335695
Journal Name:
Nanophotonics
Volume:
0
Issue:
0
ISSN:
2192-8606
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Convolutional Neural Networks (CNNs) are widely used due to their effectiveness in AI applications such as object recognition and speech processing, where the multiply-and-accumulate (MAC) operation accounts for ∼95% of the computation time. From a hardware implementation perspective, the performance of current CMOS-based MAC accelerators is limited mainly by their von Neumann architecture and the correspondingly limited memory bandwidth. Silicon photonics has therefore recently been explored as a promising route to accelerators with better speed and power efficiency than electronic memristive crossbars. In this work, we briefly survey recent silicon photonics accelerators and take initial steps toward an open-source, adaptive crossbar architecture simulator for them. Keeping the original functionality of the MNSIM tool [1], we add a new photonic mode that uses the pre-existing algorithm to work with a photonic Phase Change Memory (pPCM) based crossbar structure. Given the CNN's topology, the accelerator configuration, and experimentally benchmarked data, the presented simulator reports the optimal crossbar size, the number of crossbars needed, and estimates of total area, power, and latency.
  2. Biomimetic synaptic processes, which functional memory devices in the computer industry imitate, are a key focus of artificial intelligence (AI) research. Developing a memory technology compatible with brain-inspired computing is critical to eliminating the von Neumann bottleneck that restricts the efficiency of traditional computer designs. Silicon-based flash memory, which presently dominates the data storage market, has difficulty meeting the requirements of future data storage devices because of its high operation voltage, poor retention capacity, and high power consumption. The emerging resistive random-access memory (RRAM) has sparked intense investigation because of its simple two-terminal structure: two electrodes and a switching layer. RRAM exhibits resistive switching, a cycling between the high resistance state and the low resistance state. This device type is projected to outperform traditional memory devices. Indium gallium zinc oxide (IGZO) has attracted great attention as an RRAM switching layer because of its high transparency and the high atomic diffusivity of its oxygen atoms. More importantly, its electrical properties can be easily tuned by controlling the oxygen ratio in the sputter gas. Because IGZO is already used in thin-film transistors (TFTs), integrating RRAM with TFTs is very promising. In this work, we propose IGZO-based RRAMs. ITO was chosen as the bottom electrode to work toward a fully transparent memristor. For the IGZO switching layer, we varied the O2/Ar ratio during deposition to modify the oxygen vacancy concentration of the IGZO. XPS measurements confirmed that a higher O2/Ar ratio lowers the oxygen vacancy concentration. We also chose ITO as the top electrode and, for comparison, tested two active metals, copper and silver, as top electrode materials.
For our IGZO layer, the best O2/Ar ratio was the middle value. The device with a copper top electrode showed the most stable cyclic switching, while the silver one was best suited to a large memory window but suffered from a stability issue. Optical transmission was measured with a UV-Vis spectrometer, and the average transmittance of the complete devices in the visible wavelength range was greater than 90%, indicating good transparency. IGZO resistive switching (RS) layers of 50 nm, 100 nm, and 150 nm were produced to explore how thickness affects the characteristics of the RS layer. Because the oxygen vacancy concentration influences the RS and RRAM performance, the oxygen partial pressure during IGZO sputtering was also adjusted to optimize the layer's properties. Electrode selection is critical and can significantly influence the device's overall performance. A Cu top electrode (TE) was therefore chosen for our second type of device, because Cu ion diffusion can aid the formation of conductive filaments (CFs). Finally, a 5 nm SiO2 barrier layer was inserted between the TE and RS layers to limit Cu penetration into the RS layer. This SiO2 layer also adds interfacial series resistance, lowering the off current and thereby improving the on/off ratio and overall performance. In conclusion, transparent IGZO-based RRAMs have been created. The thickness and sputtering conditions of the RS layer were modified to tailor its properties. A series of TE materials and a barrier layer were incorporated into an IGZO-based RRAM, and the performance was evaluated in order to control the diffusion of the TE material into the RS layer and the bottom electrode (BE). Our positive findings show that IGZO is a promising material for RRAM applications and for overcoming the limitations of existing memory technology.
  3. As scaling of conventional memory devices has stalled, many high-end computing systems have begun to incorporate alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as higher bandwidth with lower capacity or vice versa, they are typically packaged alongside conventional SDRAM in a heterogeneous memory architecture. To utilize the different types of memory efficiently, new data management strategies are needed to match application usage to the best available memory technology. However, current proposals for managing heterogeneous memories are limited, because they either (1) do not consider high-level application behavior when assigning data to different types of memory or (2) require separate program execution (with a representative input) to collect information about how the application uses memory resources. This work presents a new data management toolset to address the limitations of existing approaches for managing complex memories. It extends the application runtime layer with automated monitoring and management routines that assign application data to the best tier of memory based on previous usage, without any need for source code modification or a separate profiling run. It evaluates this approach on a state-of-the-art server platform with both conventional DDR4 SDRAM and non-volatile Intel Optane DC memory, using both memory-intensive high-performance computing (HPC) applications as well as standard benchmarks. Overall, the results show that this approach improves program performance significantly compared to a standard unguided approach across a variety of workloads and system configurations. The HPC applications exhibit the largest benefits, with speedups ranging from 1.4× to 7× in the best cases. 
Additionally, we show that this approach achieves performance similar to that of a comparable offline profiling-based approach after a short startup period, without requiring separate program execution or offline analysis steps.
  4. The traditional von Neumann architecture limits gains in computing efficiency and causes massive power consumption in modern computers because storage and processing units are separated. Neuromorphic computing, an in-memory computing architecture with low power consumption, aims to break this bottleneck and meet the needs of the next generation of artificial intelligence (AI) systems. A memory technology suitable for implementing neuromorphic computing nanosystems is therefore urgently needed. Silicon-based flash memory currently dominates the non-volatile memory market; however, it struggles to meet the requirements of future data storage devices because of drawbacks such as scaling limits, relatively slow operation speed, and the high voltage needed for program/erase operations. The emerging resistive random-access memory (RRAM) has prompted extensive research owing to its simple two-terminal structure, which comprises a top electrode (TE) layer, a bottom electrode (BE) layer, and an intermediate resistive switching (RS) layer. It uses a temporary, reversible dielectric breakdown to switch between the high resistance state (HRS) and the low resistance state (LRS). RRAM is expected to outperform conventional memory devices thanks to its low-voltage operation, short programming time, great cyclic stability, and good scalability. Among the materials for the RS layer, indium gallium zinc oxide (IGZO) is attractive for its abundance, transparency, and the high atomic diffusivity of its oxygen atoms. Additionally, its electrical properties can be easily modulated by controlling the stoichiometric ratio of indium and gallium as well as the oxygen potential in the sputter gas. Moreover, since IGZO can serve as both the thin-film transistor (TFT) channel and the RS layer, it has great potential for fully integrated transparent electronics.
In this work, we propose amorphous transparent IGZO-based RRAMs and investigate the switching behavior of memory cells prepared with different top electrodes. First, ITO was chosen to serve as both TE and BE to achieve high transmittance. A multi-target magnetron sputtering system was used to deposit all three layers (TE, RS, and BE) on a glass substrate. I-V characteristics were measured with a semiconductor parameter analyzer, and the bipolar RS behavior of our RRAM devices was demonstrated by typical butterfly curves. Optical transmission was analyzed with a UV-Vis spectrometer; the average transmittance of the complete devices in the visible wavelength range was around 80%, implying high transparency. Because the oxygen vacancy concentration governs the RS performance, we adjusted the oxygen partial pressure during IGZO sputtering to optimize the layer's properties. Electrode selection is crucial and can affect the performance of the whole device. Thus, a Cu TE was chosen for our second type of device, because the diffusion of Cu ions can benefit the formation of the conductive filament (CF). A ~5 nm SiO2 barrier layer was placed between the TE and RS layers to confine the diffusion of Cu into the RS layer. This SiO2 layer also provides additional interfacial series resistance that lowers the off current and consequently improves the on/off ratio and overall performance. Finally, Ti, a metal with high oxygen affinity, was selected as the TE for our third type of device, because oxygen atoms can shift toward the Ti electrode, which provides oxygen-gettering activity near the Ti metal. This process may in turn form a sub-stoichiometric region in the neighboring oxide, which is believed to be the origin of the better performance. In conclusion, transparent amorphous IGZO-based RRAMs were demonstrated.
To tune the properties of the RS layer, its sputtering conditions were varied. To investigate the influence of TE selection on the switching performance of the RRAMs, we integrated a set of TE materials and a barrier layer into IGZO-based RRAMs and compared their switching characteristics. Our encouraging results demonstrate that IGZO is a promising material for RRAM applications and for breaking the bottleneck of current memory technologies.
  5. DNA short read alignment, classified as a complex big data analytics problem, is a major sequential bottleneck in processing the massive amounts of data generated by next-generation sequencing platforms. With von Neumann computing architectures struggling to handle such computationally expensive and memory-intensive tasks, Processing-in-Memory (PIM) platforms are attracting growing interest. In this paper, an energy-efficient and parallel PIM accelerator (AlignS) is proposed to execute DNA short read alignment based on an optimized, hardware-friendly alignment algorithm. We first develop the AlignS platform, which harnesses SOT-MRAM as computational memory and turns it into a fundamental processing unit for short read alignment. Accordingly, we present a novel, customized, highly parallel read alignment algorithm that requires only the proposed simple and parallel in-memory operations (i.e., comparisons and additions). AlignS is then optimized through a new correlated data partitioning and mapping methodology that allows local storage and processing of DNA sequences to fully exploit the algorithm-level parallelism and to accelerate both exact and inexact matches. The device-to-architecture co-simulation results show that AlignS improves the short read alignment throughput per Watt per mm^2 by ~12X compared to the ASIC accelerator. Compared to a recent FM-index-based ReRAM platform, AlignS achieves 1.6X higher throughput per Watt.
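Item 1 above mentions a simulator that reports how many crossbars a CNN needs for a given crossbar size. The snippet below is a minimal sketch of that kind of estimate, not the MNSIM algorithm or the authors' photonic mode. It assumes each convolutional layer is unrolled into a 2-D weight matrix, that one crossbar cell stores one weight, and that crossbars are square. The function names are hypothetical and chosen only for illustration.

```python
import math


def conv_weight_matrix_shape(kernel_h, kernel_w, in_channels, out_channels):
    """Unroll a conv layer into a 2-D weight matrix.

    Rows are the inputs feeding one MAC (kernel_h * kernel_w * in_channels);
    columns are the output channels.
    """
    return kernel_h * kernel_w * in_channels, out_channels


def crossbars_needed(rows, cols, crossbar_size):
    """Count square crossbar tiles needed to hold a rows x cols matrix.

    Assumes one weight per cell; partially filled tiles still count as one.
    """
    return math.ceil(rows / crossbar_size) * math.ceil(cols / crossbar_size)


if __name__ == "__main__":
    # Illustrative layer: 3x3 kernels, 64 input channels, 128 output channels.
    rows, cols = conv_weight_matrix_shape(3, 3, 64, 128)
    for size in (64, 128, 256):
        print(f"crossbar {size}x{size}: {crossbars_needed(rows, cols, size)} tiles")
```

Larger crossbars need fewer tiles but leave more cells unused in partially filled tiles, which is one reason a simulator would search over crossbar sizes rather than fix one.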