Deep neural networks (DNNs) consist of layers of neurons interconnected by synaptic weights. High bit-precision in the weights is generally required to guarantee high accuracy in many applications. Minimizing error accumulation between layers is also essential when building large-scale networks. Recent demonstrations of photonic neural networks have been limited in bit-precision due to cross talk and the high sensitivity of optical components (e.g., resonators). Here, we experimentally demonstrate a record-high precision of 9 bits with a dithering control scheme for photonic synapses. We then numerically simulate the impact of increased synaptic precision on a wireless signal classification application. This work could help realize the potential of photonic neural networks for many practical, real-world tasks.
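As a rough illustration of how synaptic bit-precision can be swept in such a simulation, the PyTorch snippet below quantizes a trained layer's weights to a given number of bits; the layer shape and bit widths are illustrative assumptions, not the experimental setup of the paper.

```python
import torch

def quantize_weights(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniformly quantize a weight tensor to the given bit-precision."""
    levels = 2 ** bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / levels
    # Snap each weight to the nearest of the 2**bits levels.
    return torch.round((w - w_min) / scale) * scale + w_min

# Example: compare a layer at 4-bit vs. 9-bit synaptic precision.
layer = torch.nn.Linear(64, 10)
with torch.no_grad():
    for bits in (4, 9):
        w_q = quantize_weights(layer.weight, bits)
        err = (layer.weight - w_q).abs().max().item()
        print(f"{bits}-bit weights, max quantization error: {err:.2e}")
```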
AnalogVNN: A fully modular framework for modeling and optimizing photonic neural networks
In this paper, we present AnalogVNN, a simulation framework built on PyTorch that can simulate the effects of optoelectronic noise, limited precision, and signal normalization present in photonic neural network accelerators. We use this framework to train and optimize linear and convolutional neural networks with up to nine layers and ∼1.7 × 10⁶ parameters, while gaining insights into how normalization, activation function, reduced precision, and noise influence accuracy in analog photonic neural networks. By following the same layer structure design present in PyTorch, the AnalogVNN framework allows users to convert most digital neural network models to their analog counterparts with just a few lines of code, taking full advantage of the open-source optimization, deep learning, and GPU acceleration libraries available through PyTorch.
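The kind of effect AnalogVNN models can be mimicked in a few lines of plain PyTorch. The sketch below is a minimal illustration of injecting additive noise and reduced precision around a linear layer; it is not the AnalogVNN API itself, and the class and parameter names are assumptions.

```python
import torch
import torch.nn as nn

class AnalogLinear(nn.Module):
    """Toy analog layer: clamped, quantized activations plus additive noise."""
    def __init__(self, in_features, out_features, bits=4, noise_std=0.01):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.levels = 2 ** bits - 1
        self.noise_std = noise_std

    def forward(self, x):
        # Normalize to [0, 1], quantize to the limited precision, then add noise.
        x = torch.clamp(x, 0.0, 1.0)
        x = torch.round(x * self.levels) / self.levels
        y = self.linear(x)
        return y + self.noise_std * torch.randn_like(y)

model = nn.Sequential(AnalogLinear(784, 128, bits=4), nn.ReLU(),
                      AnalogLinear(128, 10, bits=4))
print(model(torch.rand(2, 784)).shape)  # torch.Size([2, 10])
```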
- PAR ID: 10588248
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: APL Machine Learning
- Volume: 1
- Issue: 2
- ISSN: 2770-9019
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
In recent years, large convolutional neural networks have been widely used as tools for image deblurring because of their ability to restore images very precisely. It is well known that image deblurring is mathematically modeled as an ill-posed inverse problem, and its solution is difficult to approximate when noise affects the data. Indeed, one limitation of neural networks for deblurring is their sensitivity to noise and other perturbations, which can lead to instability and poor reconstructions. In addition, networks do not necessarily take into account the numerical formulation of the underlying imaging problem when trained end-to-end. In this paper, we propose strategies to improve stability, without losing too much accuracy, when deblurring images with deep-learning-based methods. First, we suggest a very small neural architecture, which reduces the execution time for training, satisfies a green AI need, and does not strongly amplify noise in the computed image. Second, we introduce a unified framework in which a pre-processing step balances the lack of stability of the subsequent neural-network-based step. Two different pre-processors are presented: the former implements a strong parameter-free denoiser, and the latter is a variational-model-based regularized formulation of the latent imaging problem. The framework is also formally characterized by mathematical analysis. Numerical experiments verify the accuracy and stability of the proposed approaches for image deblurring when unknown or unquantified noise is present; the results confirm that they improve the network's stability with respect to noise. In particular, the model-based framework represents the most reliable trade-off between visual precision and robustness.
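A minimal sketch of the two-step idea follows, assuming a hypothetical pre-trained deblurring network `deblur_net` and using a simple Gaussian smoothing filter as a stand-in for the parameter-free denoising pre-processor described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_denoise(img: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Stand-in pre-processor: smooth with a small Gaussian kernel to damp noise."""
    coords = torch.arange(5, dtype=torch.float32) - 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = (g / g.sum()).view(1, 1, -1)
    kernel = (g.transpose(1, 2) @ g).view(1, 1, 5, 5)
    return F.conv2d(img, kernel, padding=2)

# Hypothetical small deblurring CNN (weights would come from training).
deblur_net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)

noisy_blurred = torch.rand(1, 1, 64, 64)
stabilized = deblur_net(gaussian_denoise(noisy_blurred))  # pre-process, then deblur
print(stabilized.shape)
```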
Though photonic computing systems offer advantages in speed, scalability, and power consumption, they often have a limited dynamic encoding range due to low signal-to-noise ratios. Compared to digital floating-point encoding, photonic fixed-point encoding limits the precision of photonic computing when applied to scientific problems. In the case of iterative algorithms, such as those commonly applied in machine learning or differential equation solvers, techniques like precision decomposition and residue iteration can be applied to increase accuracy at a greater computing cost. However, the analog nature of photonic symbols allows for modulation of both amplitude and frequency, opening the possibility of encoding both the significand and exponent of floating-point values on photonic computing systems to expand the dynamic range without expending additional energy. With an appropriate schema, element-wise floating-point multiplication can be performed intrinsically through the interference of light. Herein, we present a method for configurable, signed, floating-point encoding and multiplication on a limited-precision photonic primitive consisting of a directly modulated Mach–Zehnder interferometer. We demonstrate this method using Newton's method to find the Golden Ratio within ±0.11%, with six-level exponent encoding for a signed trinary digit-equivalent significand, corresponding to an effective increase of 243× in the photonic primitive's dynamic range.
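To make the encoding idea concrete, the sketch below emulates in software what is done optically: each value is split into a coarse significand and a small integer exponent, a product multiplies significands (as interfering fields would) and adds exponents, and Newton's method then settles near the golden ratio, the positive root of x² − x − 1 = 0, to within the quantization error. The quantization levels and exponent range here are illustrative assumptions, not the device's actual parameters.

```python
def encode(x, sig_levels=243, exp_min=-3, exp_max=2):
    """Split x into a coarse significand in [-1, 1] and an integer (base-2) exponent."""
    if x == 0:
        return 0.0, exp_min
    e = exp_min
    while abs(x) / 2 ** e > 1 and e < exp_max:
        e += 1
    s = round(x / 2 ** e * (sig_levels - 1)) / (sig_levels - 1)  # quantized significand
    return s, e

def pmul(a, b):
    """Element-wise product: significands multiply while exponents simply add,
    extending the dynamic range without extra precision in the significand."""
    (sa, ea), (sb, eb) = encode(a), encode(b)
    return sa * sb * 2 ** (ea + eb)

# Newton's method for the positive root of f(x) = x^2 - x - 1 (the golden ratio),
# using the limited-precision multiply; the result lands within a fraction of a
# percent of the true value, limited by the coarse significand.
x = 2.0
for _ in range(6):
    fx = pmul(x, x) - x - 1.0
    x = x - fx / (2.0 * x - 1.0)
print(f"estimate {x:.4f} vs. golden ratio 1.6180")
```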
It has been observed that residual networks can be viewed as the explicit Euler discretization of an Ordinary Differential Equation (ODE). This observation motivated the introduction of so-called Neural ODEs, which allow more general discretization schemes with adaptive time stepping. Here, we propose ANODEV2, an extension of this approach that allows evolution of the neural network parameters in a coupled ODE-based formulation. The Neural ODE method introduced earlier is in fact a special case of this new framework. We present the formulation of ANODEV2, derive optimality conditions, and implement the coupled framework in PyTorch. We present empirical results using several different configurations of ANODEV2, testing them on multiple models on CIFAR-10. We report results showing that this coupled ODE-based framework is indeed trainable and that it achieves higher accuracy than both the baseline models and the recently proposed Neural ODE approach.
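The residual-network/ODE connection mentioned above is easy to see in code: a residual block computes x + f(x), which is exactly one explicit Euler step of dx/dt = f(x) with step size 1, and finer step sizes recover the Neural ODE view. A minimal PyTorch sketch (layer sizes are illustrative, not those used in the paper):

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(16, 16), nn.Tanh(), nn.Linear(16, 16))

def residual_block(x):
    return x + f(x)              # standard ResNet update

def euler_steps(x, n_steps):
    h = 1.0 / n_steps            # finer discretization of the same ODE dx/dt = f(x)
    for _ in range(n_steps):
        x = x + h * f(x)
    return x

x0 = torch.randn(4, 16)
print(torch.allclose(residual_block(x0), euler_steps(x0, 1)))  # True: one Euler step
```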
Batch Normalization (BN) (Ioffe and Szegedy 2015) normalizes the features of an input image via the statistics of a batch of images, and hence BN introduces noise into the gradient of the training loss. Previous works indicate that this noise is important for the optimization and generalization of deep neural networks, but too much noise harms network performance. In our paper, we offer a new point of view: the self-attention mechanism can help regulate the noise by enhancing instance-specific information, yielding a better regularization effect. We therefore propose an attention-based BN called Instance Enhancement Batch Normalization (IEBN) that recalibrates the information of each channel by a simple linear transformation. IEBN is effective at regulating batch noise and stabilizing network training to improve generalization, even in the presence of two kinds of noise attacks during training. Finally, IEBN outperforms BN with only a light parameter increment in image classification tasks across different network structures and benchmark datasets.
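A minimal sketch of the idea of recalibrating each channel with a cheap per-channel linear transformation after batch normalization; the exact gating used by IEBN may differ, so the sigmoid gate and parameter shapes below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class InstanceEnhancedBN(nn.Module):
    """BatchNorm followed by a per-channel linear gate driven by each instance's own statistics."""
    def __init__(self, channels):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)
        # One scale and bias per channel for the cheap linear recalibration.
        self.scale = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        y = self.bn(x)
        # Instance-specific channel descriptor: spatial mean of each sample.
        desc = y.mean(dim=(2, 3), keepdim=True)
        gate = torch.sigmoid(self.scale * desc + self.bias)
        return y * gate

layer = InstanceEnhancedBN(8)
print(layer(torch.randn(2, 8, 16, 16)).shape)  # torch.Size([2, 8, 16, 16])
```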
