NSF PAR Search | NSF Public Access Repository


SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs

https://doi.org/10.1109/FPL64840.2024.00044

Yang, Geng; Xie, Yanyue; Xue, Zhong Jia; Chang, Sung-En; Li, Yanyu; Dong, Peiyan; Lei, Jie; Xie, Weiying; Wang, Yanzhi; Lin, Xue; et al (September 2024, IEEE)

Full Text Available
Digital Avatars: Framework Development and Their Evaluation

https://doi.org/10.24963/ijcai.2024/1031

Rupprecht, Timothy; Chang, Sung-En; Wu, Yushu; Lu, Lei; Nan, Enfu; Li, Chih-hsiang; Lai, Caiyue; Li, Zhimin; Hu, Zhijun; He, Yumei; et al (August 2024, International Joint Conferences on Artificial Intelligence Organization)

We present a novel prompting strategy for artificial intelligence (AI) driven digital avatars. To better quantify how our prompting strategy affects anthropomorphic features such as humor, authenticity, and favorability, we present Crowd Vote, an adaptation of Crowd Score that allows judges to elect a large language model (LLM) candidate over competitors answering the same or similar prompts. To visualize the responses of our LLM and the effectiveness of our prompting strategy, we propose an end-to-end framework for creating high-fidelity AI-driven digital avatars. This pipeline effectively captures an individual's essence for interaction, and our streaming algorithm delivers a high-quality digital avatar with real-time audio-video streaming from server to mobile device. Both our visualization tool and our Crowd Vote metrics demonstrate that our AI-driven digital avatars have state-of-the-art humor, authenticity, and favorability, outperforming all competitors and baselines. In the case of our Donald Trump and Joe Biden avatars, their authenticity and favorability are rated higher than even their real-world equivalents.
Full Text Available
SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits

Xie, Yanyue; Dong, Peiyan; Yuan, Geng; Li, Zhengang; Zabihi, Masoud; Wu, Chao; Chang, Sung-En; Zhang, Xufeng; Lin, Xue; Ding, Caiwen; et al (March 2024, 2024 Design, Automation & Test in Europe Conference)

Full Text Available
You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding

https://doi.org/10.1007/978-3-031-19775-8_3

Yuan, Geng; Chang, Sung-En; Jin, Qing; Lu, Alec; Li, Yanyu; Wu, Yushu; et al (October 2022, European Conference on Computer Vision (ECCV))
ESRU: Extremely Low-Bit and Hardware-Efficient Stochastic Rounding Unit Design for Low-Bit DNN Training

https://doi.org/10.23919/DATE56975.2023.10137222

Chang, Sung-En; Yuan, Geng; Lu, Alec; Sun, Mengshu; Li, Yanyu; Ma, Xiaolong; Li, Zhengang; Xie, Yanyue; Qin, Minghai; Lin, Xue; et al (April 2023, 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE))
ESRU: Extremely Low-Bit and Hardware-Efficient Stochastic Rounding Unit Design for Low-Bit DNN Training

Chang, Sung-En; Yuan, Geng; Lu, Alec; Sun, Mengshu; Li, Yanyu; Ma, Xiaolong; Li, Zhengang; Xie, Yanyue; Qin, Minghai; Lin, Xue; et al (January 2023, Design, Automation & Test in Europe Conference & Exhibition (DATE))

Full Text Available
You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding

Yuan, Geng; Chang, Sung-En; Jin, Qing; Lu, Alec; Li, Yanyu; Wu, Yushu; Kong, Zhenglun; Xie, Yanyue; Dong, Peiyan; Qin, Minghai; et al (October 2022, European Conference on Computer Vision (ECCV))

Full Text Available
FILM-QNN: Efficient FPGA Acceleration of Deep Neural Networks with Intra-Layer, Mixed-Precision Quantization

https://doi.org/10.1145/3490422.3502364

Sun, Mengshu; Li, Zhengang; Lu, Alec; Li, Yanyu; Chang, Sung-En; Ma, Xiaolong; Lin, Xue (January 2022, Proceedings of the 30th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA))

Full Text Available
ILMPQ: An Intra-Layer Multi-Precision Deep Neural Network Quantization Framework for FPGA

Chang, Sung-En; Li, Yanyu; Sun, Mengshu; Wang, Yanzhi; Lin, Xue (February 2021, The Fifth Workshop on Cognitive Architectures (CogArch 2021))

This work targets the commonly used FPGA (field-programmable gate array) devices as the hardware platform for DNN edge computing. We focus on DNN quantization as the main model compression technique. The novelty of this work is a quantization method that supports multiple precisions along the intra-layer dimension, whereas existing quantization methods apply multi-precision quantization along the inter-layer dimension. The intra-layer multi-precision method makes the hardware configuration uniform across layers, which reduces computation overhead while preserving the same model accuracy as the inter-layer approach. Our proposed ILMPQ DNN quantization framework achieves 70.73% Top-1 accuracy with ResNet-18 on the ImageNet dataset. We also validate the proposed MSP framework on two FPGA devices, i.e., Xilinx XC7Z020 and XC7Z045. We achieve a 3.65× speedup in end-to-end inference time on ImageNet compared with the fixed-point quantization method.
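The intra-layer idea in the abstract above can be illustrated with a minimal sketch. This is not the ILMPQ implementation: the 4-bit and 8-bit widths, the 25% high-precision ratio, the "largest magnitude" row-selection rule, and the function names `quantize_row`, `assign_bits`, and `quantize_layer` are all assumptions made for illustration. The only point taken from the abstract is that precision varies across rows inside a layer, while the same ratio can be reused for every layer so the hardware configuration stays uniform.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_row(row: List[float], bits: int) -> List[float]:
    """Symmetric uniform (fixed-point style) quantization of one weight row."""
    levels = 2 ** (bits - 1) - 1  # 7 levels per sign for 4-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in row)
    if max_abs == 0.0:
        return [0.0 for _ in row]
    scale = max_abs / levels
    return [round(w / scale) * scale for w in row]


def assign_bits(weights: Matrix, high_ratio: float,
                low_bits: int, high_bits: int) -> List[int]:
    """Give a fixed fraction of rows the higher precision.

    Assumption: rows with the largest absolute weight get more bits.
    The abstract does not state the selection rule.
    """
    n_high = round(len(weights) * high_ratio)
    ranked = sorted(range(len(weights)),
                    key=lambda i: max(abs(w) for w in weights[i]),
                    reverse=True)
    high_rows = set(ranked[:n_high])
    return [high_bits if i in high_rows else low_bits
            for i in range(len(weights))]


def quantize_layer(weights: Matrix, high_ratio: float = 0.25,
                   low_bits: int = 4, high_bits: int = 8
                   ) -> Tuple[Matrix, List[int]]:
    """Quantize each row of a layer with its own bit-width.

    Reusing the same high_ratio for every layer keeps the low/high
    precision mix identical across layers.
    """
    bits = assign_bits(weights, high_ratio, low_bits, high_bits)
    quantized = [quantize_row(r, b) for r, b in zip(weights, bits)]
    return quantized, bits


if __name__ == "__main__":
    layer = [[0.5, -0.25, 0.1],
             [1.0, 0.3, -0.7],
             [0.05, 0.02, -0.01],
             [0.2, -0.2, 0.15]]
    q, bits = quantize_layer(layer)
    for b, row in zip(bits, q):
        print(b, [f"{w:.4f}" for w in row])
```

By contrast, an inter-layer scheme would pick one bit-width per layer, so each layer could need a different hardware configuration.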
Full Text Available
RMSMP: A Novel Deep Neural Network Quantization Framework With Row-Wise Mixed Schemes and Multiple Precisions

https://doi.org/10.1109/ICCV48922.2021.00520

Chang, Sung-En; Li, Yanyu; Sun, Mengshu; Jiang, Weiwen; Liu, Sijia; Wang, Yanzhi; Lin, Xue (January 2021, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV))

Full Text Available
