Analog photonic solutions offer unique opportunities to address complex computational tasks with unprecedented performance in energy dissipation and speed, overcoming the limitations of modern computing architectures based on electron flows and digital approaches. However, the lack of modularization and lumped-element reconfigurability in photonics has so far prevented the transition to an all-optical analog computing platform. Here, we use numerical simulation to explore a nanophotonic platform based on epsilon-near-zero materials that is capable of solving partial differential equations (PDEs) in the analog domain. Wavelength stretching in zero-index media enables highly nonlocal interactions within the board based on the conduction of electric displacement, which can be monitored to extract the solution of a broad class of PDE problems. By exploiting experimentally achieved control of the deposition technique through its process parameters, which we use in our simulations, we demonstrate the possibility of implementing the proposed nano-optic processor using CMOS-compatible indium-tin-oxide, whose optical properties can be tuned by carrier injection to obtain programmability at high speed and with low energy requirements. Our nano-optical analog processor can be integrated at chip scale, processing arbitrary inputs at the speed of light.
- Journal Name: Communications Physics
- Publisher: Nature Publishing Group
- Sponsoring Org: National Science Foundation
More Like this
This paper presents an energy-efficient classification framework that performs human activity recognition (HAR). Typically, HAR classification tasks require a computational platform that includes a processor and memory along with sensors and their interfaces, all of which consume significant power. The presented framework employs a microelectromechanical systems (MEMS) based Continuous Time Recurrent Neural Network (CTRNN) to perform HAR tasks very efficiently. In a real physical implementation, we show that the MEMS-CTRNN nodes can perform computing while consuming power on the nanowatt scale, compared with the microwatt scale of state-of-the-art hardware. We also confirm that this large power reduction does not come at the expense of reduced performance by evaluating the framework's accuracy in classifying the highly cited human activity recognition dataset (HAPT). Our simulation results show that the HAR framework, which consists of a training module and a network of MEMS-based CTRNN nodes, provides HAR classification accuracy for the HAPT that is comparable to traditional CTRNN and other Recurrent Neural Network (RNN) implementations. For example, in the worst-case scenario of not using pre-processing techniques such as quantization, the average accuracy of the MEMS-based CTRNN model in classifying 5 different activities is 77.94%, compared with 78.48% for the traditional CTRNN.
Innovative processor architectures aim to play a critical role in sustaining performance improvements under the severe limitations imposed by the end of Moore’s Law. The Reconfigurable Optical Computer (ROC) is one such innovative, Post-Moore’s Law processor. ROC is designed to solve partial differential equations in one shot, as opposed to existing solutions, which are based on costly iterative computations. This is achieved by leveraging the physical properties of a mesh of optical components that behave analogously to lumped electrical components. However, virtualization is required to combat shortfalls of the accelerator hardware, namely 1) the infeasibility of building large photonic arrays to accommodate arbitrarily large problems, and 2) underutilization brought about by mismatches in problem and accelerator mesh sizes due to future advances in manufacturing technology. In this work, we introduce an architecture and methodology for lightweight virtualization of ROC that exploits advantages borne from optical computing technology. Specifically, we apply temporal and spatial virtualization to ROC and then extend the accelerator scheduling tradespace with the introduction of spectral virtualization. Additionally, we investigate multiple resource scheduling strategies for a system-on-chip (SoC)-based PDE acceleration architecture and show that virtual configuration management offers a speedup of approximately 2×. Finally, we show that …
We present a hybrid optical-electrical analog deep learning (DL) accelerator, the first work to use incoherent optical signals for DL workloads. Incoherent optical designs are more attractive than coherent ones because they can be more easily realized in practice. However, a significant challenge in analog DL accelerators, where multiply-accumulate operations are dominant, is that there is no known solution to perform accumulation using incoherent optical signals. We overcome this challenge by devising a hybrid approach: accumulation is done in the electrical domain, while multiplication is performed in the optical domain. The key technology enabler of our design is the transistor laser, which performs electrical-to-optical and optical-to-electrical conversions efficiently to tightly integrate electrical and optical devices into compact circuits. As such, our design fully realizes the ultra-high-speed and high-energy-efficiency advantages of analog and optical computing. Our evaluation results using the MNIST benchmark show that our design achieves 2214× and 65× improvements in latency and energy, respectively, compared to a state-of-the-art memristor-based analog design.
We investigate the use of SmartNIC-accelerated servers to execute microservice-based applications in the data center. By offloading suitable microservices to the SmartNIC’s low-power processor, we can improve server energy-efficiency without latency loss. However, as a heterogeneous computing substrate in the data path of the host, SmartNICs bring several challenges to a microservice platform: network traffic routing and load balancing, microservice placement on heterogeneous hardware, and contention on shared SmartNIC resources. We present E3, a microservice execution platform for SmartNIC-accelerated servers. E3 follows the design philosophies of the Azure Service Fabric microservice platform and extends key system components to a SmartNIC to address the above-mentioned challenges. E3 employs three key techniques: ECMP-based load balancing via SmartNICs to the host, network topology-aware microservice placement, and a data-plane orchestrator that can detect SmartNIC overload. Our E3 prototype using Cavium LiquidIO SmartNICs shows that SmartNIC offload can improve cluster energy-efficiency up to 3× and cost efficiency up to 1.9× at up to 4% latency cost for common microservices, including real-time analytics, an IoT hub, and virtual network functions.
The exponential growth of information stored in data centers and computational power required for various data-intensive applications, such as deep learning and AI, call for new strategies to improve or move beyond the traditional von Neumann architecture. Recent achievements in information storage and computation in the optical domain, enabling energy-efficient, fast, and high-bandwidth data processing, show great potential for photonics to overcome the von Neumann bottleneck and reduce the energy wasted to Joule heating. Optically readable memories are fundamental in this process, and while light-based storage has traditionally (and commercially) employed free-space optics, recent developments in photonic integrated circuits (PICs) and optical nano-materials have opened the doors to new opportunities on-chip. Photonic memories have yet to rival their electronic digital counterparts in storage density; however, their inherent analog nature and ultrahigh bandwidth make them ideal for unconventional computing strategies. Here, we review emerging nanophotonic devices that possess memory capabilities by elaborating on their tunable mechanisms and evaluating them in terms of scalability and device performance. Moreover, we discuss the progress on large-scale architectures for photonic memory arrays and optical computing primarily based on memory performance.