NSF PAR Search | NSF Public Access Repository

Trimming Down Large Spiking Vision Transformers Via Heterogeneous Quantization Search

https://doi.org/10.1109/ASAP65064.2025.00016

Xu, Boxun; Song, Yufei; Li, Peng (July 2025, IEEE)

Free, publicly-accessible full text available July 28, 2026
Bishop: Sparsified Bundling Spiking Transformers on Heterogeneous Cores with Error-constrained Pruning

https://doi.org/10.1145/3695053.3731063

Xu, Boxun; Yin, Yuxuan; Iyer, Vikram; Li, Peng (June 2025, International Symposium on Computer Architecture (ISCA))

Free, publicly-accessible full text available June 20, 2026
Backpropagation-based learning with local derivative approximation and memory replay in biologically plausible neural systems

https://doi.org/10.1016/j.neucom.2025.129804

Boone, Richard; Li, Peng (June 2025, Neurocomputing)

Free, publicly-accessible full text available June 1, 2026
HoSNNs: Adversarially-Robust Homeostatic Spiking Neural Networks with Adaptive Firing Thresholds

Geng, Hejia; Li, Peng (March 2025, Transactions on Machine Learning Research)

Free, publicly-accessible full text available March 1, 2026
Spiking Transformer Hardware Accelerators in 3D Integration

Xu, Boxun; Hwang, Junyoung; Vanna-iampikul, Pruek; Lim, Sung Kyu; Li, Peng (October 2024, IEEE/ACM International Conference on Computer-Aided Design (ICCAD ’24))

Spiking neural networks (SNNs) are powerful models of spatiotemporal computation and are well suited for deployment on resource-constrained edge devices and neuromorphic hardware due to their low power consumption. Leveraging attention mechanisms similar to those found in their artificial neural network counterparts, recently emerged spiking transformers have showcased promising performance and efficiency by capitalizing on the binary nature of spiking operations. Recognizing the current lack of dedicated hardware support for spiking transformers, this paper presents the first work on 3D spiking transformer hardware architecture and design methodology. We present an architecture and physical design co-optimization approach tailored specifically for spiking transformers. Through memory-on-logic and logic-on-logic stacking enabled by 3D integration, we demonstrate significant energy and delay improvements compared to conventional 2D CMOS integration.
Full Text Available
Systolic Array Acceleration of Spiking Neural Networks with Application-Independent Split-Time Temporal Coding

Lee, Jeong-Jun; Li, Peng (August 2024, 2024 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED))

Spiking Neural Networks (SNNs) are brain-inspired computing models with event-driven, low-power operation and unique temporal dynamics. However, the spatial and temporal dynamics of SNNs impose significant overhead on accelerating neural computations and limit the computing capabilities of neuromorphic accelerators. In particular, unstructured sparsity emerging in both space and time, i.e., across neurons and time points, together with iterative computations across time points, creates a primary bottleneck in data movement. In this work, we propose a novel technique and architecture that exploit temporal information compression with structured sparsity and parallelism across time, and significantly improve data movement on a systolic array. We split the full temporal range into several time windows (TWs), where each TW packs multiple time points, and encode the temporal information in each TW with Split-Time Temporal coding (STT) by limiting the number of spikes within a TW to at most one. STT sparsifies and structures irregular firing activity and dramatically reduces computational overhead while delivering competitive classification accuracy. To further improve data reuse, we propose an Integration Through Time (ITT) technique that processes integration steps across different TWs in parallel on a systolic array. The proposed architecture with STT and ITT offers an application-independent solution for spike-based models across various types of layers and networks. It delivers 97X latency and 78X energy efficiency improvements on average over a conventional SNN baseline across different benchmarks.
Full Text Available
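The windowing idea in the abstract above (split the time axis into TWs and allow at most one spike per TW) can be illustrated with a small sketch. This is not the paper's encoder: the abstract does not say which spike survives in a window or how its timing is represented, so keeping the first spike and storing its offset within the window is an assumption made here, and the names `split_time_encode` and `split_time_decode` are hypothetical.

```python
from typing import List, Optional


def split_time_encode(spikes: List[int], window: int) -> List[Optional[int]]:
    """Encode a binary spike train as one code per time window.

    Each code is the offset of the FIRST spike inside the window
    (an assumption of this sketch), or None if the window is silent.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    codes: List[Optional[int]] = []
    for start in range(0, len(spikes), window):
        chunk = spikes[start:start + window]
        # Keep at most one spike per window: the earliest one.
        offset = next((i for i, s in enumerate(chunk) if s), None)
        codes.append(offset)
    return codes


def split_time_decode(codes: List[Optional[int]], window: int, length: int) -> List[int]:
    """Rebuild a spike train with at most one spike per window."""
    out = [0] * length
    for w, offset in enumerate(codes):
        if offset is not None:
            out[w * window + offset] = 1
    return out


# Twelve time points grouped into three windows of four points each.
train = [0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1]
codes = split_time_encode(train, window=4)
sparse_train = split_time_decode(codes, window=4, length=len(train))
```

In this toy input the first and third windows each contain two spikes, so the encoded train carries two spikes instead of four, and every window has a regular, single-slot structure.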
Composing recurrent spiking neural networks using locally-recurrent motifs and risk-mitigating architectural optimization

https://doi.org/10.3389/fnins.2024.1412559

Zhang, Wenrui; Geng, Hejia; Li, Peng (June 2024, Frontiers in Neuroscience)

In neural circuits, recurrent connectivity plays a crucial role in network function and stability. However, existing recurrent spiking neural networks (RSNNs) are often constructed with random connections and without optimization. While RSNNs can produce rich dynamics that are critical for memory formation and learning, systematic architectural optimization of RSNNs is still an open challenge. We aim to enable systematic design of large RSNNs via a new scalable RSNN architecture and automated architectural optimization. We compose RSNNs based on a layer architecture called Sparsely-Connected Recurrent Motif Layer (SC-ML) that consists of multiple small recurrent motifs wired together by sparse lateral connections. The small size of the motifs and the sparse inter-motif connectivity lead to an RSNN architecture that scales to large network sizes. We further propose a method called Hybrid Risk-Mitigating Architectural Search (HRMAS) to systematically optimize the topology of the proposed recurrent motifs and SC-ML layer architecture. HRMAS is an alternating two-step optimization process. It mitigates the risk of network instability and performance degradation caused by architectural change by introducing a novel biologically-inspired “self-repairing” mechanism through intrinsic plasticity. The intrinsic plasticity is introduced in the second step of each HRMAS iteration and acts as unsupervised fast self-adaptation to the structural and synaptic weight modifications introduced by the first step during the RSNN architectural “evolution.” We demonstrate that the proposed automatic architecture optimization leads to significant performance gains over existing manually designed RSNNs: we achieve 96.44% on TI46-Alpha, 94.66% on N-TIDIGITS, 90.28% on DVS-Gesture, and 98.72% on N-MNIST. To the best of the authors' knowledge, this is the first work to perform systematic architecture optimization on RSNNs.
Full Text Available
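The SC-ML layout described in the abstract above (small recurrent motifs joined by sparse lateral connections) can be pictured as a connectivity mask. The sketch below is only an illustration under stated assumptions: the abstract says motif topology is optimized by HRMAS, whereas here every motif is fully connected internally (self-connections included) and lateral connections are sampled at random with a fixed probability. The function name `build_sc_ml_mask` and its parameters are hypothetical.

```python
import random
from typing import List


def build_sc_ml_mask(num_motifs: int, motif_size: int,
                     lateral_prob: float, seed: int = 0) -> List[List[int]]:
    """Return an n x n 0/1 mask where mask[pre][post] == 1 means a connection.

    Assumptions of this sketch (not taken from the paper):
    - neurons inside the same motif are fully connected;
    - each inter-motif (lateral) connection exists with probability lateral_prob.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    n = num_motifs * motif_size
    mask = [[0] * n for _ in range(n)]
    for pre in range(n):
        for post in range(n):
            same_motif = pre // motif_size == post // motif_size
            if same_motif or rng.random() < lateral_prob:
                mask[pre][post] = 1
    return mask


# With no lateral connections the mask is block diagonal:
# three motifs of two neurons give 3 * 2 * 2 = 12 connections.
block_only = build_sc_ml_mask(num_motifs=3, motif_size=2, lateral_prob=0.0)
connection_count = sum(sum(row) for row in block_only)
```

Because the dense blocks grow with the motif size rather than the layer size, the number of intra-motif connections grows linearly in the number of motifs, which is one way to read the scalability claim in the abstract.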
