Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File

Mittal, Sparsh; Wang, Haonan; Jog, Adwait; Vetter, Jeffrey S.

doi:10.1109/VLSID.2017.14

Modern graphics processing units (GPUs) use increasingly large register files (RFs), which occupy a large fraction of the GPU core area and are accessed very frequently. This makes the RF vulnerable to soft errors (SEs). In this paper, we present two techniques for improving the SE resilience of the GPU RF. First, we propose compressing RF values to reduce the number of vulnerable bits. We leverage value similarity and the presence of narrow-width values to perform compression at the warp level and the thread level, respectively. Second, we propose selective hardening, which designs a portion of each register entry with SE-immune circuits. Used together, these techniques provide higher resilience at lower overhead. Without hardening, our warp-level and thread-level compression techniques reduce SE vulnerability by 47.0% and 40.8%, respectively.
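The compression idea in the abstract can be illustrated with a small sketch. It is not the authors' encoding, which the abstract does not specify. It assumes 32 threads per warp, 32-bit registers, sign-extension trimming for narrow-width values at the thread level, and a simple base-plus-delta scheme for value similarity at the warp level. It ignores the metadata bits a real design would need, and all function names are hypothetical. The sketch only counts how many bits would still hold live data, so it is not expected to reproduce the 47.0% and 40.8% figures reported above.

```python
REG_BITS = 32    # bits per register value (assumption)
WARP_SIZE = 32   # threads per warp (assumption, typical for NVIDIA GPUs)


def narrow_width_bits(value: int) -> int:
    """Bits needed for a 32-bit two's-complement value once redundant
    sign-extension bits are dropped (one sign bit is always kept)."""
    v = value & 0xFFFFFFFF
    if v & 0x80000000:
        v = ~v & 0xFFFFFFFF  # negative: count below the leading ones
    return v.bit_length() + 1


def thread_level_bits(values: list[int]) -> int:
    """Thread-level sketch: each thread's value is trimmed on its own."""
    return sum(narrow_width_bits(v) for v in values)


def warp_level_bits(values: list[int]) -> int:
    """Warp-level sketch: keep thread 0's value as a full-width base and
    store every other thread as a delta, all deltas at one shared width."""
    base = values[0]
    delta_width = max((narrow_width_bits(v - base) for v in values[1:]), default=0)
    compressed = REG_BITS + delta_width * (len(values) - 1)
    return min(compressed, REG_BITS * len(values))  # never worse than raw


def vulnerability_reduction(compressed_bits: int, n_values: int) -> float:
    """Fraction of register bits that no longer hold vulnerable data."""
    return 1.0 - compressed_bits / (REG_BITS * n_values)


# Hypothetical warp register holding addresses strided by thread ID.
warp_values = [0x1000 + 4 * tid for tid in range(WARP_SIZE)]
thread_bits = thread_level_bits(warp_values)
warp_bits = warp_level_bits(warp_values)
print(thread_bits, vulnerability_reduction(thread_bits, WARP_SIZE))
print(warp_bits, vulnerability_reduction(warp_bits, WARP_SIZE))
```

In this made-up example the values across the warp are similar but not small, so the warp-level scheme leaves fewer live bits than per-thread trimming. A warp holding unrelated small constants would favor the thread-level scheme instead.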

Award ID(s): 1657336

PAR ID: 10048266

Author(s) / Creator(s): Mittal, Sparsh; Wang, Haonan; Jog, Adwait; Vetter, Jeffrey S.

Date Published: 2017-01-01

Journal Name: International Conference on VLSI Design and International Conference on Embedded Systems (VLSID)

Page Range / eLocation ID: 409 to 414


Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/VLSID.2017.14
