The spatiotemporal nature of neuronal behavior in spiking neural networks (SNNs) makes SNNs promising for edge applications that require high energy efficiency. To realize SNNs in hardware, spintronic neuron implementations can bring advantages of scalability and energy efficiency. Domain wall (DW)-based magnetic tunnel junction (MTJ) devices are well suited for probabilistic neural networks given their intrinsic integrate-and-fire behavior with tunable stochasticity. Here, we present a scaled DW-MTJ neuron with voltage-dependent firing probability. The measured behavior was used to simulate an SNN that attains learning accuracy comparable to that of an equivalent, but more complicated, multi-weight DW-MTJ device. The validation accuracy during training was also shown to be comparable to that of an ideal leaky integrate-and-fire device. However, during inference, the binary DW-MTJ neuron outperformed the other devices after Gaussian noise was introduced to the Fashion-MNIST classification task. This work shows that DW-MTJ devices can be used to construct noise-resilient networks suitable for neuromorphic computing on the edge.
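To make the device behavior concrete, here is a minimal sketch of a leaky integrate-and-fire neuron whose firing is a Bernoulli event with a voltage-dependent probability, mimicking the tunable stochasticity described above. The sigmoid shape, leak factor, and threshold are illustrative assumptions, not the measured DW-MTJ characteristics.

```python
# Minimal sketch (illustrative parameters, not the measured device):
# a leaky integrate-and-fire neuron that fires as a Bernoulli event
# with probability rising sigmoidally with the membrane "voltage".
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stochastic_lif(inputs, leak=0.9, gain=4.0, threshold=1.0):
    """Integrate inputs with leak; fire probabilistically near threshold."""
    v, spikes = 0.0, []
    for x in inputs:
        v = leak * v + x                           # leaky integration
        p_fire = sigmoid(gain * (v - threshold))   # voltage-dependent firing probability
        fired = rng.random() < p_fire
        spikes.append(int(fired))
        if fired:
            v = 0.0                                # reset after a spike
    return spikes

print(stochastic_lif(rng.uniform(0.0, 0.5, size=20)))
```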
This content will become publicly available on December 1, 2026
Quantum-limited stochastic optical neural networks operating at a few quanta per activation
Abstract: Energy efficiency in computation is ultimately limited by noise, with quantum limits setting the fundamental noise floor. Analog physical neural networks hold promise for improved energy efficiency compared to digital electronic neural networks. However, they are typically operated in a relatively high-power regime so that the signal-to-noise ratio (SNR) is large (>10), and the noise can be treated as a perturbation. We study optical neural networks where all layers except the last are operated in the limit that each neuron can be activated by just a single photon, and as a result the noise on neuron activations is no longer merely perturbative. We show that by using a physics-based probabilistic model of the neuron activations in training, it is possible to perform accurate machine-learning inference in spite of the extremely high shot noise (SNR ~ 1). We experimentally demonstrated MNIST handwritten-digit classification with a test accuracy of 98% using an optical neural network with a hidden layer operating in the single-photon regime; the optical energy used to perform the classification corresponds to just 0.038 photons per multiply-accumulate (MAC) operation. Our physics-aware stochastic training approach might also prove useful with non-optical ultra-low-power hardware.
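As a rough illustration of the regime the abstract describes, the sketch below models a hidden layer whose activations are Poisson photon counts with a mean of about one photon per neuron, so the per-neuron SNR is ~1 (Poisson SNR equals the square root of the mean count). The layer sizes and intensity scaling are assumptions for the toy noise model; the paper's physics-based probabilistic training procedure is more involved.

```python
# Toy shot-noise-limited forward pass (assumed layer sizes and scaling):
# each neuron's activation is a Poisson photon count whose mean is the
# nonnegative optical intensity, rescaled to ~1 photon per neuron.
import numpy as np

rng = np.random.default_rng(1)

def noisy_layer(x, W, mean_photons=1.0):
    intensity = np.maximum(x @ W, 0.0)                    # nonnegative optical power
    mu = mean_photons * intensity / (intensity.mean() + 1e-12)
    return rng.poisson(mu), mu                            # noisy counts, true means

W = rng.uniform(0.0, 1.0, (16, 8))
x = rng.uniform(0.0, 1.0, 16)
counts, mu = noisy_layer(x, W)
print("mean photons per neuron:", mu.mean())              # ~1 by construction
print("per-neuron SNR ~", np.sqrt(mu).mean())             # Poisson SNR = sqrt(mean)
print("sampled counts:", counts)
```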
- Award ID(s): 1918549
- PAR ID: 10579710
- Publisher / Repository: Springer Nature
- Date Published:
- Journal Name: Nature Communications
- Volume: 16
- Issue: 1
- ISSN: 2041-1723
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Abstract: Deep-learning models have become pervasive tools in science and engineering. However, their energy requirements now increasingly limit their scalability [1]. Deep-learning accelerators [2–9] aim to perform deep learning energy-efficiently, usually targeting the inference phase and often by exploiting physical substrates beyond conventional electronics. Approaches so far [10–22] have been unable to apply the backpropagation algorithm to train unconventional novel hardware in situ. The advantages of backpropagation have made it the de facto training method for large-scale neural networks, so this deficiency constitutes a major impediment. Here we introduce a hybrid in situ–in silico algorithm, called physics-aware training, that applies backpropagation to train controllable physical systems. Just as deep learning realizes computations with deep neural networks made from layers of mathematical functions, our approach allows us to train deep physical neural networks made from layers of controllable physical systems, even when the physical layers lack any mathematical isomorphism to conventional artificial neural network layers. To demonstrate the universality of our approach, we train diverse physical neural networks based on optics, mechanics and electronics to experimentally perform audio and image classification tasks. Physics-aware training combines the scalability of backpropagation with the automatic mitigation of imperfections and noise achievable with in situ algorithms. Physical neural networks have the potential to perform machine learning faster and more energy-efficiently than conventional electronic processors and, more broadly, can endow physical systems with automatically designed physical functionalities, for example, for robotics [23–26], materials [27–29] and smart sensors [30–32].
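A minimal sketch of the physics-aware-training loop described above: the forward pass is evaluated by the (here simulated) physical system, while the gradient is computed by backpropagating through a differentiable digital model of it. The toy linear `physical_forward`, its noise level, and the learning rate are illustrative stand-ins, not the paper's apparatus or hyperparameters.

```python
# Toy physics-aware training: forward pass through the "physical" system,
# backward pass through a differentiable digital surrogate of it.
import numpy as np

rng = np.random.default_rng(2)
theta = rng.normal(0, 0.1, 4)                  # controllable physical parameters

def digital_model(x, theta):
    return x @ theta                           # differentiable surrogate

def physical_forward(x, theta):
    # Imperfect physical system: the surrogate plus unmodeled noise.
    return digital_model(x, theta) + rng.normal(0, 0.05, size=x.shape[0])

x = rng.normal(0, 1, (32, 4))
y = x @ np.array([1.0, -2.0, 0.5, 0.0])        # toy regression target
for step in range(200):
    y_phys = physical_forward(x, theta)        # in situ forward pass
    err = y_phys - y                           # output-layer error signal
    grad = x.T @ err / len(x)                  # backprop through the *model*
    theta -= 0.1 * grad
print("learned parameters:", np.round(theta, 2))
```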
Spiking Neural Networks (SNNs) are fast emerging as an alternative to Deep Neural Networks (DNNs). They are computationally more powerful and provide higher energy efficiency than DNNs. While exciting at first glance, SNNs contain security-sensitive assets (e.g., neuron threshold voltage) and vulnerabilities (e.g., sensitivity of classification accuracy to neuron threshold voltage change) that can be exploited by adversaries. We explore global fault injection attacks using an external power supply and laser-induced local power glitches on SNNs designed using common analog neurons to corrupt critical training parameters such as spike amplitude and the neuron's membrane threshold potential. We also analyze the impact of power-based attacks on an SNN for a digit classification task and observe a worst-case classification accuracy degradation of 85.65%. We explore the impact of various SNN design parameters (e.g., learning rate, spike trace decay constant, and number of neurons) and identify design choices for robust implementation of SNNs. We recover 30–47% of the degraded classification accuracy for a subset of power-based attacks by modifying SNN training parameters such as learning rate, trace decay constant, and neurons per layer. We also propose hardware-level defenses, e.g., a robust current driver design that is immune to power-oriented attacks, and improved circuit sizing of neuron components that reduces or recovers the adversarial accuracy degradation at the cost of negligible area and a 25% power overhead. We also propose dummy-neuron-based detection of voltage fault injection at ~1% power and area overhead each.
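The sensitivity being exploited can be illustrated with a toy simulation: shifting a leaky integrate-and-fire neuron's membrane threshold, as a supply-voltage glitch might, changes its output spike count and therefore the evidence reaching downstream classifier layers. All values below are illustrative, not measurements from the paper.

```python
# Toy demonstration of threshold sensitivity in a leaky integrate-and-fire
# neuron: glitching the threshold shifts the spike count, which downstream
# layers interpret as a change in input evidence.
import numpy as np

rng = np.random.default_rng(3)

def lif_spike_count(inputs, threshold, leak=0.9):
    v, count = 0.0, 0
    for x in inputs:
        v = leak * v + x
        if v >= threshold:
            count += 1
            v = 0.0                            # reset after a spike
    return count

inputs = rng.uniform(0.0, 0.4, 100)
for thr in (1.0, 0.8, 1.2):                    # nominal vs. glitched thresholds
    print(f"threshold={thr:.1f} -> spikes={lif_spike_count(inputs, thr)}")
```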
Abstract: Deep neural networks such as GoogLeNet, ResNet, and BERT have achieved impressive performance in tasks such as image and text classification. To understand how such performance is achieved, we probe a trained deep neural network by studying neuron activations, i.e., combinations of neuron firings, at various layers of the network in response to a particular input. With a large number of inputs, we aim to obtain a global view of what neurons detect by studying their activations. In particular, we develop visualizations that show the shape of the activation space, the organizational principle behind neuron activations, and the relationships of these activations within a layer. Applying tools from topological data analysis, we present TopoAct, a visual exploration system to study topological summaries of activation vectors. We present exploration scenarios using TopoAct that provide valuable insights into learned representations of neural networks. We expect TopoAct to give a topological perspective that enriches the current toolbox of neural network analysis, and to provide a basis for network architecture diagnosis and data anomaly detection.
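For intuition, the following sketch computes the kind of summary TopoAct visualizes: activation vectors are projected onto a one-dimensional lens, the lens range is covered by overlapping intervals, and intervals that share points are linked, yielding a basic mapper-style graph. It is a toy stand-in (a real mapper additionally clusters within each interval), not the TopoAct system itself, and the activation vectors here are random placeholders.

```python
# Toy mapper-style summary of activation vectors: 1-D lens, overlapping
# cover, one node per non-empty interval, edges where intervals share points.
# (Real mapper also clusters within each interval before linking.)
import numpy as np

rng = np.random.default_rng(4)
acts = rng.normal(0, 1, (300, 64))             # stand-in activation vectors

# Lens: projection onto the first principal component.
_, _, vt = np.linalg.svd(acts - acts.mean(0), full_matrices=False)
lens = acts @ vt[0]

lo, hi, n_bins, overlap = lens.min(), lens.max(), 6, 0.25
width = (hi - lo) / n_bins
nodes, edges = [], set()
for i in range(n_bins):
    a = lo + i * width - overlap * width       # interval widened by the overlap
    b = lo + (i + 1) * width + overlap * width
    members = np.where((lens >= a) & (lens <= b))[0]
    if len(members):
        nodes.append(set(members.tolist()))
for i in range(len(nodes)):
    for j in range(i + 1, len(nodes)):
        if nodes[i] & nodes[j]:                # shared points -> edge
            edges.add((i, j))
print(f"{len(nodes)} nodes, {len(edges)} edges in the mapper graph")
```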
Abstract: Deep learning has become a widespread tool in both science and industry. However, continued progress is hampered by the rapid growth in energy costs of ever-larger deep neural networks. Optical neural networks provide a potential means to solve the energy-cost problem faced by deep learning. Here, we experimentally demonstrate an optical neural network based on optical dot products that achieves 99% accuracy on handwritten-digit classification using ~3.1 detected photons per weight multiplication and ~90% accuracy using ~0.66 photons (~2.5 × 10⁻¹⁹ J of optical energy) per weight multiplication. The fundamental principle enabling our sub-photon-per-multiplication demonstration (noise reduction from the accumulation of scalar multiplications in dot-product sums) is applicable to many different optical-neural-network architectures. Our work shows that optical neural networks can achieve accurate results using extremely low optical energies.
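The stated principle, noise reduction from accumulation in dot-product sums, can be checked numerically: even at well under one photon per multiplication on average, the summed dot product has relative shot noise that shrinks like one over the square root of the total photon count. The sketch below uses illustrative sizes together with the abstract's ~0.66 photons per multiplication.

```python
# Numeric check of shot-noise accumulation: N Poisson-noisy product terms,
# each far below one photon on average, sum to a dot product whose relative
# noise is ~1/sqrt(total mean photons).
import numpy as np

rng = np.random.default_rng(5)
N = 1000                                       # multiplications per dot product
photons_per_mult = 0.66                        # cf. the abstract's ~0.66 photons
mu = rng.uniform(0, 2 * photons_per_mult, N)   # mean photons per product term

trials = rng.poisson(mu, size=(2000, N)).sum(axis=1)
rel_noise = trials.std() / trials.mean()
print(f"total mean photons: {mu.sum():.0f}")
print(f"relative noise of the dot product: {rel_noise:.3f}")
print(f"theory 1/sqrt(total): {1 / np.sqrt(mu.sum()):.3f}")
```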