NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Latent Weight-Based Pruning for Small Binary Neural Networks

https://doi.org/10.1145/3566097.3567873

Chen, Tianen; Anderson, Noah; Kim, Younghyun (January 2023, ASPDAC '23: Proceedings of the 28th Asia and South Pacific Design Automation Conference)

Binary neural networks (BNNs) substitute complex arithmetic operations with simple bit-wise operations. The binarized weights and activations in BNNs can drastically reduce memory requirement and energy consumption, making it attractive for edge ML applications with limited resources. However, the severe memory capacity and energy constraints of low-power edge devices call for further reduction of BNN models beyond binarization. Weight pruning is a proven solution for reducing the size of many neural network (NN) models, but the binary nature of BNN weights make it difficult to identify insignificant weights to remove. In this paper, we present a pruning method based on latent weight with layer-level pruning sensitivity analysis which reduces the over-parameterization of BNNs, allowing for accuracy gains while drastically reducing the model size. Our method advocates for a heuristics that distinguishes weights by their latent weights, a real-valued vector used to compute the pseudogradient during backpropagation. It is tested using three different convolutional NNs on the MNIST, CIFAR-10, and Imagenette datasets with results indicating a 33%--46% reduction in operation count, with no accuracy loss, improving upon previous works in accuracy, model size, and total operation count.
more » « less
Full Text Available
SynthNet: A High-throughput yet Energy-efficient Combinational Logic Neural Network

https://doi.org/10.1109/ASP-DAC52403.2022.9712554

Chen, Tianen; Kemp, Taylor; Kim, Younghyun (January 2022, 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC))

Full Text Available
Approximate Hardware Techniques for Energy-Quality Scaling Across the System

https://doi.org/10.1109/ICEIC49074.2020.9051208

Kim, Younghyun; Miguel, Joshua San; Behroozi, Setareh; Chen, Tianen; Lee, Kyuin; Lee, Yongwoo; Li, Jingjie; Wu, Di (January 2020, 2020 International Conference on Electronics, Information, and Communication (ICEIC))

For error-resilient applications, such as machine learning and signal processing, a significant improvement in energy efficiency can be achieved by relaxing exactness constraint on output quality. This paper presents a taxonomy of hardware techniques to exploit the trade-off between energy efficiency and quality in various computer subsystems. We classify approximate hardware techniques according to target subsystem and support for dynamic energy-quality scaling.
more » « less
Full Text Available
SECO: A Scalable Accuracy Approximate Exponential Function Via Cross-Layer Optimization

https://doi.org/10.1109/ISLPED.2019.8824959

Wu, Di; Chen, Tianen; Chen, Chienfu; Ahia, Oghenefego; Miguel, Joshua San; Lipasti, Mikko; Kim, Younghyun (July 2019, 2019 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED))

From signal processing to emerging deep neural networks, a range of applications exhibit intrinsic error resilience. For such applications, approximate computing opens up new possibilities for energy-efficient computing by producing slightly inaccurate results using greatly simplified hardware. Adopting this approach, a variety of basic arithmetic units, such as adders and multipliers, have been effectively redesigned to generate approximate results for many error-resilient applications.In this work, we propose SECO, an approximate exponential function unit (EFU). Exponentiation is a key operation in many signal processing applications and more importantly in spiking neuron models, but its energy-efficient implementation has been inadequately explored. We also introduce a cross-layer design method for SECO to optimize the energy-accuracy trade-off. At the algorithm level, SECO offers runtime scaling between energy efficiency and accuracy based on approximate Taylor expansion, where the error is minimized by optimizing parameters using discrete gradient descent at design time. At the circuit level, our error analysis method efficiently explores the design space to select the energy-accuracy-optimal approximate multiplier at design time. In tandem, the cross-layer design and runtime optimization method are able to generate energy-efficient and accurate approximate EFU designs that are up to 99.7% accurate at a power consumption of 3.73 pJ per exponential operation. SECO is also evaluated on the adaptive exponential integrate-and-fire neuron model, yielding only 0.002% timing error and 0.067% value error compared to the precise neuron model.
more » « less
Full Text Available

Search for: All records