NSF PAR Search | NSF Public Access Repository


Fast and Fair Medical AI on the Edge Through Neural Architecture Search for Hybrid Vision Models

https://doi.org/10.1109/ICCAD57390.2023.10323652

Yang, Changdi; Sheng, Yi; Dong, Peiyan; Kong, Zhenglun; Li, Yanyu; Yu, Pinrui; Yang, Lei; Lin, Xue; Wang, Yanzhi (October 2023, IEEE)

Full Text Available
Machine Learning Across Network-Connected FPGAs

https://doi.org/10.1109/HPEC58863.2023.10363454

Diaconu, Dana; Xie, Yanyue; Gungor, Mehmet; Handagala, Suranga; Lin, Xue; Leeser, Miriam (September 2023, IEEE)

Full Text Available
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers

https://doi.org/10.1109/HPCA56546.2023.10071047

Dong, Peiyan; Sun, Mengshu; Lu, Alec; Xie, Yanyue; Liu, Kenneth; Kong, Zhenglun; Meng, Xin; Li, Zhengang; Lin, Xue; Fang, Zhenman; et al (February 2023, 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA))

Full Text Available
ESRU: Extremely Low-Bit and Hardware-Efficient Stochastic Rounding Unit Design for Low-Bit DNN Training

Chang, Sung-En; Yuan, Geng; Lu, Alec; Sun, Mengshu; Li, Yanyu; Ma, Xiaolong; Li, Zhengang; Xie, Yanyue; Qin, Minghai; Lin, Xue; et al (January 2023, Design, Automation & Test in Europe Conference & Exhibition (DATE))

Full Text Available
Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization

https://doi.org/10.1109/FPL57034.2022.00027

Li, Zhengang; Sun, Mengshu; Lu, Alec; Ma, Haoyu; Yuan, Geng; Xie, Yanyue; Tang, Hao; Li, Yanyu; Leeser, Miriam; Wang, Zhangyang; et al (August 2022, 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL))

Full Text Available
Late Breaking Results: FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization

Sun, Mengshu; Li, Zhengang; Lu, Alec; Ma, Haoyu; Yuan, Geng; Xie, Yanyue; Tang, Hao; Li, Yanyu; Leeser, Miriam; Wang, Zhangyang; et al (January 2022, Proceedings of the 59th Design Automation Conference (DAC))

Full Text Available
FILM-QNN: Efficient FPGA Acceleration of Deep Neural Networks with Intra-Layer, Mixed-Precision Quantization

https://doi.org/10.1145/3490422.3502364

Sun, Mengshu; Li, Zhengang; Lu, Alec; Li, Yanyu; Chang, Sung-En; Ma, Xiaolong; Lin, Xue (January 2022, Proceedings of the 30th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA))

Full Text Available
NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration

Li, Zhengang; Yuan, Geng; Niu, Wei; Zhao, Pu; Li, Yanyu; Cai, Yuxuan; Shen, Xuan; Zhan, Zheng; Kong, Zhenglun; Jin, Qing; et al (June 2021, IEEE Conference on Computer Vision and Pattern Recognition)
Full Text Available
Non-Structured DNN Weight Pruning--Is It Beneficial in Any Platform?

https://doi.org/10.1109/TNNLS.2021.3063265

Ma, Xiaolong; Lin, Sheng; Ye, Shaokai; He, Zhezhi; Zhang, Linfeng; Yuan, Geng; Tan, Sia Huat; Li, Zhengang; Fan, Deliang; Qian, Xuehai; et al (March 2021, IEEE Transactions on Neural Networks and Learning Systems)
Full Text Available
ILMPQ: An Intra-Layer Multi-Precision Deep Neural Network Quantization Framework for FPGA

Chang, Sung-En; Li, Yanyu; Sun, Mengshu; Wang, Yanzhi; Lin, Xue (February 2021, The Fifth Workshop on Cognitive Architectures (CogArch 2021))

This work targets commonly used FPGA (field-programmable gate array) devices as the hardware platform for DNN edge computing, with quantization as the main model compression technique. The novelty is a quantization method that supports multiple precisions along the intra-layer dimension, whereas existing methods apply multi-precision quantization along the inter-layer dimension. The intra-layer multi-precision method makes the hardware configuration uniform across layers, which reduces computation overhead, while preserving model accuracy on par with the inter-layer approach. The proposed ILMPQ DNN quantization framework achieves 70.73% top-1 accuracy with ResNet-18 on the ImageNet dataset. We also validate the proposed framework on two FPGA devices, the Xilinx XC7Z020 and XC7Z045, and achieve a 3.65× speedup in end-to-end inference time on ImageNet compared with the fixed-point quantization method.
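The intra-layer idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how precisions are assigned within a layer, so the 8-bit/4-bit pair, the 25% high-precision fraction, the row-wise (output-channel) granularity, and the magnitude-based selection rule are all assumptions made for illustration, and `quantize_row` and `intra_layer_quantize` are hypothetical names. The point it shows is that applying one fixed precision ratio inside every layer gives all layers the same mix of bit-widths.

```python
import random


def quantize_row(row, bits):
    """Symmetric uniform fixed-point quantization of one weight row."""
    levels = 2 ** (bits - 1) - 1              # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in row)
    if max_abs == 0.0:
        return list(row)                      # all-zero row needs no scaling
    scale = max_abs / levels
    return [round(w / scale) * scale for w in row]


def intra_layer_quantize(weights, high_bits=8, low_bits=4, high_fraction=0.25):
    """Quantize the rows of ONE layer with two precisions.

    Assumption (not from the abstract): the rows with the largest absolute
    weight get the higher precision. The fraction is the same for every
    layer, so each layer ends up with the same precision ratio.
    """
    n_rows = len(weights)
    n_high = round(n_rows * high_fraction)
    # Hypothetical sensitivity proxy: rank rows by their largest magnitude.
    order = sorted(range(n_rows),
                   key=lambda i: max(abs(w) for w in weights[i]),
                   reverse=True)
    high_rows = set(order[:n_high])
    bits_per_row = [high_bits if i in high_rows else low_bits
                    for i in range(n_rows)]
    quantized = [quantize_row(weights[i], bits_per_row[i])
                 for i in range(n_rows)]
    return quantized, bits_per_row


# Two toy layers of different shapes, filled with random weights.
random.seed(0)
layer_a = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
layer_b = [[random.gauss(0, 1) for _ in range(32)] for _ in range(16)]

quant_a, bits_a = intra_layer_quantize(layer_a)
quant_b, bits_b = intra_layer_quantize(layer_b)

# Both layers share the same 8-bit share of rows despite different sizes.
ratio_a = bits_a.count(8) / len(bits_a)
ratio_b = bits_b.count(8) / len(bits_b)
```

An inter-layer scheme would instead pick one bit-width per layer, so different layers would need differently configured compute units. Keeping the ratio fixed within each layer is one way to read the abstract's claim of uniform hardware configurations across layers.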
Full Text Available
