A coherent multi-dimensional photonic tensor accelerator performing high-speed matrix-matrix multiplication is proposed and demonstrated. A pattern recognition experiment is demonstrated at a 25 Gbps modulation speed exploiting orthogonal dimensions of light including time, wavelength, and spatial mode.
Vector-mode Multiplexing For Photonic Tensor Accelerator
We propose a coherent multi-dimensional (wavelength, spatial mode, polarization, etc.) photonic tensor accelerator capable of performing high-speed artificial neural network computation. High-speed matrix-vector and matrix-matrix multiplication were experimentally demonstrated.
- Award ID(s):
- 1932858
- PAR ID:
- 10351215
- Date Published:
- Journal Name:
- OptoElectronics and Communications Conference (OECC) and International Conference on Photonics in Switching and Computing (PSC)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
As one of the most promising future fundamental devices, the memristor has a unique advantage in implementing low-power, high-speed matrix multiplication. Taking advantage of the flexibility of memristor crossbars and their high performance on basic matrix operations, in this paper we investigate both the discrete Fourier transform (DFT) and the multiple-input multiple-output (MIMO) detection unit in a baseband processor. We reformulate the signal processing algorithms and model structures into a matrix-based framework, and present a memristor crossbar based DFT module design and MIMO detector module design. For both designs, experimental results demonstrate significant gains in speed and power efficiency compared with traditional CMOS-based designs.
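The matrix-based reformulation of the DFT mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's crossbar design: it only shows that an N-point DFT is a single matrix-vector product, which is the kind of operation a crossbar array computes. The function names `dft_matrix`, `mat_vec`, and `dft` are hypothetical and chosen for illustration.

```python
import cmath


def dft_matrix(n):
    """Return the n x n DFT matrix W with W[j][k] = exp(-2*pi*i*j*k/n)."""
    return [
        [cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n)]
        for j in range(n)
    ]


def mat_vec(matrix, vector):
    """Plain matrix-vector product: one multiply-accumulate per matrix entry."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def dft(signal):
    """Compute the DFT of `signal` as a single matrix-vector multiplication."""
    return mat_vec(dft_matrix(len(signal)), signal)


if __name__ == "__main__":
    # A constant signal has all of its energy in the zero-frequency bin.
    print([round(abs(v), 6) for v in dft([1, 1, 1, 1])])
```

The matrix depends only on the transform length, so in a hardware setting it could in principle be programmed once and reused for every input vector; how the paper maps complex entries onto devices is not stated in the abstract.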
-
Matrix multiplication is a fundamental building block for large-scale computations arising in various applications, including machine learning. There has been significant recent interest in using coding to speed up distributed matrix multiplication in a way that is robust to stragglers (i.e., machines that may perform slower computations). In many scenarios, instead of exact computation, approximate matrix multiplication, i.e., allowing for a tolerable error, is sufficient. Such approximate schemes make use of randomization techniques to speed up the computation process. In this paper, we initiate the study of approximate coded matrix multiplication, and investigate the joint synergies offered by randomization and coding. Specifically, we propose two coded randomized sampling schemes that use (a) codes to achieve a desired recovery threshold and (b) random sampling to obtain an approximation of the matrix multiplication. Tradeoffs between the recovery threshold and the approximation error obtained through random sampling are investigated for a class of coded matrix multiplication schemes.
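The random-sampling half of this idea can be sketched as follows. This is a minimal illustration of the standard norm-proportional column-row sampling estimator for a matrix product, not the coded schemes proposed in the paper; the coding and recovery-threshold part is omitted entirely. The function name `sampled_matmul` and its parameters are hypothetical.

```python
import math
import random


def sampled_matmul(a, b, num_samples, seed=0):
    """Approximate a @ b by sampling column-row outer products.

    Index k is drawn with probability proportional to
    ||a[:, k]|| * ||b[k, :]||, and each sampled outer product is rescaled
    by 1 / (num_samples * p_k) so that the estimate is unbiased.
    """
    inner = len(b)                      # shared dimension of a and b
    rows, cols = len(a), len(b[0])
    col_norms = [math.sqrt(sum(a[i][k] ** 2 for i in range(rows)))
                 for k in range(inner)]
    row_norms = [math.sqrt(sum(v * v for v in b[k])) for k in range(inner)]
    weights = [cn * rn for cn, rn in zip(col_norms, row_norms)]
    total = sum(weights)

    result = [[0.0] * cols for _ in range(rows)]
    if total == 0:
        return result                   # the exact product is the zero matrix

    probs = [w / total for w in weights]
    rng = random.Random(seed)
    for k in rng.choices(range(inner), weights=probs, k=num_samples):
        scale = 1.0 / (num_samples * probs[k])
        for i in range(rows):
            for j in range(cols):
                result[i][j] += scale * a[i][k] * b[k][j]
    return result


if __name__ == "__main__":
    a = [[1.0, 2.0, 0.5], [0.0, 1.0, 3.0]]
    b = [[1.0, 0.0], [2.0, 1.0], [0.0, 4.0]]
    print(sampled_matmul(a, b, num_samples=50))
```

Increasing `num_samples` reduces the variance of the estimate at the cost of more work, which is the kind of accuracy-versus-cost tradeoff the abstract refers to.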
-
In this work, we present the GraphS architecture, which transforms current Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) into massively parallel computational units capable of accelerating graph processing applications. GraphS can be leveraged to greatly reduce the energy consumed by the underlying adjacency matrix computations, eliminating unnecessary off-chip accesses and providing ultra-high internal bandwidth. The device-to-architecture co-simulation for three social network data-sets indicates roughly 3.6X higher energy efficiency and 5.3X speed-up over a recent ReRAM crossbar. It achieves ~4X higher energy efficiency and 5.1X speed-up over recent processing-in-DRAM acceleration methods.
-
In this paper, we propose GraphiDe, a novel DRAM-based processing-in-memory (PIM) accelerator for graph processing. It transforms the current DRAM architecture into massively parallel computational units, exploiting the high internal bandwidth of modern memory chips to accelerate various graph processing applications. GraphiDe can be leveraged to greatly reduce the energy consumption and latency of the underlying adjacency matrix computations by eliminating unnecessary off-chip accesses. The extensive circuit-architecture simulations over three social network data-sets indicate that GraphiDe achieves on average a 3.1x energy-efficiency improvement and 4.2x speed-up over the recent DRAM-based PIM platform. It achieves ~59x higher energy efficiency and 83x speed-up over GPU-based acceleration methods.

