Search for: All records

Creators/Authors contains: "Whatmough, Paul"


Total Resources: 8

Resource Type
Conference Paper: 6
Conference Proceeding: 0
Dataset: 0
Journal Article: 2
Workshop Report: 0

Availability
Full Text / Resource Available: 8
Citation Only: 0



Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators

Tyagi, Abhishek ; Gan, Yiming ; Liu, Shaoshan ; Yu, Bo ; Whatmough, Paul ; Zhu, Yuhao ( March 2023 , IEEE International Symposium on High-Performance Computer Architecture)

Full Text Available
22.9 A 12nm 18.1TFLOPs/W Sparse Transformer Processor with Entropy-Based Early Exit, Mixed-Precision Predication and Fine-Grained Power Management

https://doi.org/10.1109/ISSCC42615.2023.10067817

Tambe, Thierry ; Zhang, Jeff ; Hooper, Coleman ; Jia, Tianyu ; Whatmough, Paul N. ; Zuckerman, Joseph ; Santos, Maico Cassel ; Loscalzo, Erik Jens ; Giri, Davide ; Shepard, Kenneth ; et al ( February 2023 , 2023 IEEE International Solid-State Circuits Conference (ISSCC))

Large language models have substantially advanced nuance and context understanding in natural language processing (NLP), further fueling the growth of intelligent conversational interfaces and virtual assistants. However, their hefty computational and memory demands make them potentially expensive to deploy on cloudless edge platforms with strict latency and energy requirements. For example, an inference pass using the state-of-the-art BERT-base model must serially traverse through 12 computationally intensive transformer layers, each layer containing 12 parallel attention heads whose outputs concatenate to drive a large feed-forward network. To reduce computation latency, several algorithmic optimizations have been proposed, e.g., a recent algorithm dynamically matches linguistic complexity with model sizes via entropy-based early exit. Deploying such transformer models on edge platforms requires careful co-design and optimizations from algorithms to circuits, where energy consumption is a key design consideration.
Full Text Available
A 16-nm SoC for Noise-Robust Speech and NLP Edge AI Inference With Bayesian Sound Source Separation and Attention-Based DNNs

https://doi.org/10.1109/JSSC.2022.3179303

Tambe, Thierry ; Yang, En-Yu ; Ko, Glenn G. ; Chai, Yuji ; Hooper, Coleman ; Donato, Marco ; Whatmough, Paul N. ; Rush, Alexander M. ; Brooks, David ; Wei, Gu-Yeon ( June 2022 , IEEE Journal of Solid-State Circuits)

Full Text Available
SMIV: A 16-nm 25-mm² SoC for IoT With Arm Cortex-A53, eFPGA, and Coherent Accelerators

https://doi.org/10.1109/JSSC.2021.3115466

Lee, Sae Kyu ; Whatmough, Paul N. ; Donato, Marco ; Ko, Glenn G. ; Brooks, David ; Wei, Gu-Yeon ( February 2022 , IEEE Journal of Solid-State Circuits)

Full Text Available
FixyFPGA: Efficient FPGA Accelerator for Deep Neural Networks with High Element-Wise Sparsity and without External Memory Access

https://doi.org/10.1109/FPL53798.2021.00010

Meng, Jian ; Venkataramanaiah, Shreyas Kolala ; Zhou, Chuteng ; Hansen, Patrick ; Whatmough, Paul ; Seo, Jae-sun ( August 2021 , 2021 31st International Conference on Field-Programmable Logic and Applications (FPL))

Full Text Available
9.8 A 25mm² SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET

https://doi.org/10.1109/ISSCC42613.2021.9366062

Tambe, Thierry ; Yang, En-Yu ; Ko, Glenn G. ; Chai, Yuji ; Hooper, Coleman ; Donato, Marco ; Whatmough, Paul N. ; Rush, Alexander M. ; Brooks, David ; Wei, Gu-Yeon ( February 2021 , International Solid-State Circuits Conference)

Full Text Available
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference

https://doi.org/10.1145/3466752.3480095

Tambe, Thierry ; Hooper, Coleman ; Pentecost, Lillian ; Jia, Tianyu ; Yang, En-Yu ; Donato, Marco ; Sanh, Victor ; Whatmough, Paul ; Rush, Alexander M. ; Brooks, David ; et al ( October 2021 , MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture)

Full Text Available
A 16nm 25mm² SoC with a 54.5x Flexibility-Efficiency Range from Dual-Core Arm Cortex-A53 to eFPGA and Cache-Coherent Accelerators

https://doi.org/10.23919/VLSIC.2019.8778002

Whatmough, Paul N. ; Lee, Sae Kyu ; Donato, Marco ; Hsueh, Hsea-Ching ; Xi, Sam Likun ; Gupta, Udit ; Pentecost, Lillian ; Ko, Glenn G. ; Brooks, David ; Wei, Gu-Yeon ( June 2019 , 2019 Symposium on VLSI Circuits)

Full Text Available