Recently, integrated optics has become a functional platform for implementing machine learning algorithms and, in particular, neural networks. Photonic integrated circuits can straightforwardly perform vector-matrix multiplications with high efficiency and low power consumption by using a weighting mechanism based on linear optics. However, the same cannot be said for the activation function, i.e., the “threshold,” which requires either nonlinear optics or an electro-optic module with an appropriate dynamic range. Although all-optical nonlinearities are potentially faster, they remain challenging to integrate and are rather inefficient at present. Here, we demonstrate an electroabsorption modulator based on an indium tin oxide layer monolithically integrated into silicon photonic waveguides, whose dynamic range serves as the nonlinear activation function of a photonic neuron. The thresholding mechanism is based on a photodiode, which integrates the weighted products and whose photovoltage drives the electroabsorption modulator. The synapse-and-neuron circuit is then used to execute a 200-node MNIST classification neural network, benchmarking the nonlinear activation function against an equivalent electronic module.
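As a rough illustration of the neuron described above, the sketch below models the weighted photodiode sum followed by an EAM whose transmission supplies the nonlinearity. The sigmoidal `eam_transmission` curve and its parameters (`v_on`, `steepness`, `responsivity`) are hypothetical stand-ins, not the measured response of the ITO device.

```python
import numpy as np

# Sketch of the photonic neuron described above: a photodiode sums the
# weighted optical products, and its photovoltage drives an electroabsorption
# modulator (EAM) whose transmission acts as the nonlinear activation.
# The EAM transfer curve below is a hypothetical sigmoid stand-in; the real
# device response would be measured, not assumed.

def eam_transmission(voltage, v_on=0.5, steepness=8.0):
    """Hypothetical EAM transmission vs. drive voltage (0..1)."""
    return 1.0 / (1.0 + np.exp(-steepness * (voltage - v_on)))

def photonic_neuron(inputs, weights, responsivity=1.0):
    # Linear optics: weighted products, summed on the photodiode.
    photocurrent = responsivity * np.dot(weights, inputs)
    # The photovoltage drives the EAM, imposing the nonlinearity.
    return eam_transmission(photocurrent)

x = np.array([0.2, 0.8, 0.5])   # input optical powers
w = np.array([0.4, 0.3, 0.3])   # synaptic weights
print(photonic_neuron(x, w))
```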
This content will become publicly available on July 6, 2026
An Optical Feedforward Neural Network with Nonlinear Activation Function Provided by a RSAM
A feedforward optical neural network for image recognition is demonstrated through free-space optics in which the nonlinear activation function is provided by a resonant semiconductor saturable absorption mirror (RSAM). Although free-space optics with a liquid-crystal-based spatial light modulator has the potential to allow parallel processing for improved performance, using a fiber-optic electro-optic modulator for parallel-to-serial conversion greatly simplifies the experimental setup. We show that the nonlinear activation function introduced by the RSAM improves classification accuracy by a notable 8.1% over a purely linear network when tested on the MNIST image-classification data set. The impact of noise on the system implementation is also investigated.
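A minimal sketch of how a saturable absorber can act as an activation function, assuming the textbook saturable-absorption model R(I) = R_lin + ΔR · I/(I + I_sat). The parameter values (`r_lin`, `delta_r`, `i_sat`) are illustrative and not taken from the RSAM used in the paper.

```python
import numpy as np

# Sketch of a saturable-absorber activation: reflectivity rises with
# incident intensity as absorption saturates, so the reflected power is a
# smooth, monotonic nonlinear function of the input power.
# All parameters below are illustrative placeholders.

def rsam_activation(intensity, r_lin=0.6, delta_r=0.35, i_sat=1.0):
    """Reflected power for incident intensity under saturable absorption."""
    reflectivity = r_lin + delta_r * intensity / (intensity + i_sat)
    return intensity * reflectivity

# Applied element-wise to the optical powers of one network layer:
layer_out = rsam_activation(np.linspace(0.0, 5.0, 6))
print(layer_out)
```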
- Award ID(s): 2211990
- PAR ID: 10631839
- Publisher / Repository: IEEE
- Date Published:
- ISBN: 979-8-3315-9777-1
- Page Range / eLocation ID: 1 to 7
- Format(s): Medium: X
- Location: Barcelona, Spain
- Sponsoring Org: National Science Foundation
More Like this
- The nonlinearity of activation functions used in deep learning models is crucial for the success of predictive models. Several simple nonlinear functions, including the Rectified Linear Unit (ReLU) and Leaky-ReLU (L-ReLU), are commonly used in neural networks to impose this nonlinearity. In practice, these functions remarkably enhance model accuracy. However, there is limited insight into how the nonlinearity in neural networks affects their performance. Here, we investigate the performance of neural network models as a function of nonlinearity, using ReLU and L-ReLU activation functions in the context of different model architectures and data domains. We use entropy as a measure of randomness to quantify the effects of nonlinearity, across different architecture shapes, on the performance of neural networks. We show that the ReLU nonlinearity is mostly the better choice of activation function when the network has a sufficient number of parameters. However, we found that image classification models with transfer learning seem to perform well with L-ReLU in the fully connected layers. We show that the entropy of hidden-layer outputs in neural networks can fairly represent the fluctuations in information loss as a function of nonlinearity. Furthermore, we investigate the entropy profile of shallow neural networks as a way of representing their hidden-layer dynamics.
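A minimal sketch of the entropy measurement described above, assuming a simple histogram-based Shannon entropy over the hidden activations of a synthetic one-layer network; the paper's exact entropy estimator and architectures are not specified here.

```python
import numpy as np

# Compare the histogram entropy of hidden activations under ReLU vs.
# Leaky-ReLU for one random layer on synthetic data.

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def activation_entropy(values, bins=50):
    """Shannon entropy (bits) of a histogram of activation values."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # drop empty bins before log
    return -np.sum(p * np.log2(p))

x = rng.standard_normal((1000, 64))               # synthetic inputs
w = rng.standard_normal((64, 128)) / np.sqrt(64)  # random layer weights
pre = x @ w                                       # pre-activations

print("ReLU entropy:      ", activation_entropy(relu(pre)))
print("Leaky-ReLU entropy:", activation_entropy(leaky_relu(pre)))
```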
- Lensfree holographic microscopy is a compact and cost-effective modality for imaging large fields of view with high resolution. When combined with automated image processing, it can be used for biomolecular sensing in which biochemically functionalized micro- and nano-beads label biomolecules of interest. Neural networks for image-feature classification provide faster and more robust sensing results than traditional image-processing approaches. While neural networks have been widely applied to other types of image classification problems, and even to image reconstruction in lensfree holographic microscopy, it is unclear what type of network architecture performs best for the small-object image classification problems involved in holographic sensors. Here, we apply a shallow convolutional neural network to this task and thoroughly investigate how different layers and hyperparameters affect network performance. Layers include dropout, convolutional, normalization, pooling, and activation layers; hyperparameters include dropout fraction, filter number and size, stride, and padding. We ultimately achieve a network accuracy of ∼83% and find that the choice of activation layer is most important for maximizing accuracy. We hope that these results can be helpful for researchers developing neural networks for similar classification tasks.
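A minimal PyTorch sketch of a shallow CNN assembled from the layer types the study searches over (convolution, normalization, activation, pooling, dropout). All sizes and hyperparameters below are illustrative placeholders, not the tuned values from the paper.

```python
import torch
import torch.nn as nn

# One conv stage with the layer types mentioned above, followed by a
# linear classifier, for small single-channel image crops.

class ShallowBeadNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(16),   # normalization layer
            nn.ReLU(),            # activation layer
            nn.MaxPool2d(2),      # pooling layer
            nn.Dropout(p=0.25),   # dropout layer
        )
        # 32x32 input -> 16 channels at 16x16 after pooling
        self.classifier = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

net = ShallowBeadNet()
dummy = torch.randn(4, 1, 32, 32)  # batch of 32x32 single-channel crops
print(net(dummy).shape)            # -> torch.Size([4, 2])
```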
- Inkjet-printed circuits on flexible substrates are rapidly emerging as a key technology in flexible electronics, driven by their minimal fabrication process, cost-effectiveness, and environmental sustainability. Recent advancements in inkjet-printed devices and circuits have broadened their applications in both sensing and computing. Building on this progress, this work develops a nonlinear computational element, coined mTanh, to serve as an activation function in neural networks. Activation functions are essential in neural networks because they introduce nonlinearity, enabling machine learning models to capture complex patterns. However, widely used functions such as tanh and the sigmoid often suffer from the vanishing-gradient problem, limiting the depth of neural networks. To address this, alternative functions such as ReLU and Leaky ReLU have been explored, yet these introduce their own challenges, such as the dying-ReLU issue, bias shifting, and noise sensitivity. The proposed mTanh activation function effectively mitigates the vanishing-gradient problem, allowing deeper neural network architectures without compromising training efficiency. This study demonstrates the feasibility of mTanh as an activation function by integrating it into an Echo State Network to predict the Mackey–Glass time-series signal. The results show that mTanh performs comparably to tanh, ReLU, and Leaky ReLU in this task. Additionally, the vanishing-gradient resistance of mTanh was evaluated by implementing it in a deep multi-layer perceptron for Fashion-MNIST image classification. The study indicates that mTanh enables the addition of 3–5 extra layers compared to tanh and the sigmoid, while exhibiting vanishing-gradient resistance similar to ReLU. These results highlight the potential of mTanh as a promising activation function for deep learning models, particularly in flexible-electronics applications.
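The abstract does not give mTanh's closed form, so the sketch below uses a hypothetical stand-in, tanh plus a small linear leak, purely to illustrate how a modified tanh can keep its gradient bounded away from zero at saturation; the printed-circuit mTanh may well differ.

```python
import numpy as np

# Hypothetical modified-tanh activation (NOT the paper's mTanh): adding a
# small linear leak keeps the derivative >= `leak` even deep in saturation,
# which is one simple route around the vanishing-gradient problem.

def mtanh_like(z, leak=0.05):
    return np.tanh(z) + leak * z

def mtanh_like_grad(z, leak=0.05):
    # d/dz [tanh(z) + leak*z] = 1 - tanh(z)^2 + leak, never below `leak`.
    return 1.0 - np.tanh(z) ** 2 + leak

z = np.array([-5.0, 0.0, 5.0])
print(mtanh_like(z))       # saturating but still slowly growing
print(mtanh_like_grad(z))  # gradient floor at `leak`
```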
- This paper presents NeurEPDiff, a novel network that rapidly predicts the geodesics in deformation spaces generated by the well-known Euler–Poincaré differential equation (EPDiff). To achieve this, we develop a neural operator that, for the first time, learns the evolving trajectory of geodesic deformations parameterized in the tangent space of diffeomorphisms (a.k.a. velocity fields). In contrast to previous methods that purely fit the training images, our proposed NeurEPDiff learns a nonlinear mapping function between the time-dependent velocity fields. A composition of integral operators and smooth activation functions is formulated in each layer of NeurEPDiff to effectively approximate such mappings. Because NeurEPDiff rapidly provides the numerical solution of EPDiff given any initial condition, it significantly reduces the computational cost of geodesic shooting of diffeomorphisms in a high-dimensional image space. Additionally, the discretization/resolution-invariance of NeurEPDiff makes its performance generalizable to multiple image resolutions after offline training. We demonstrate the effectiveness of NeurEPDiff in registering two image datasets: 2D synthetic data and 3D brain magnetic resonance imaging (MRI). Its registration accuracy and computational efficiency are compared with state-of-the-art diffeomorphic registration algorithms based on geodesic shooting.
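A minimal sketch of one neural-operator layer of the kind NeurEPDiff composes: an integral operator (realized here as a truncated spectral convolution) plus a pointwise linear path, followed by a smooth activation. Everything below, weights included, is synthetic and one-dimensional; the actual NeurEPDiff operates on velocity fields of diffeomorphisms.

```python
import numpy as np

# One operator layer: integral operator via truncated Fourier
# multiplication, plus a local linear term, then a smooth activation.
# Keeping only low Fourier modes is what makes such layers
# discretization/resolution-invariant in spirit.

rng = np.random.default_rng(0)

def spectral_layer(v, modes=8):
    n = v.shape[0]
    v_hat = np.fft.rfft(v)                         # to Fourier space
    weights = rng.standard_normal(modes) + 1j * rng.standard_normal(modes)
    out_hat = np.zeros_like(v_hat)
    out_hat[:modes] = v_hat[:modes] * weights      # keep only low modes
    integral = np.fft.irfft(out_hat, n=n)          # back to physical space
    local = 0.5 * v                                # pointwise linear path
    return np.tanh(integral + local)               # smooth activation

v0 = np.sin(np.linspace(0, 2 * np.pi, 64, endpoint=False))  # toy velocity field
print(spectral_layer(v0)[:5])
```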