CURE: A High-Performance, Low-Power, and Reliable Network-on-Chip Design Using Reinforcement Learning

Wang, Ke; Louri, Ahmed

doi:10.1109/TPDS.2020.2986297

We propose CURE, a deep reinforcement learning (DRL)-based network-on-chip (NoC) design framework that simultaneously reduces network latency, improves energy efficiency, and tolerates transient errors and permanent faults. CURE combines several architectural innovations with a DRL-based hardware controller that manages design complexity and optimizes trade-offs. First, we propose reversible multi-function adaptive channels (RMCs) to reduce NoC power consumption and network latency. Second, we implement new fault-secure adaptive error-correction hardware in each router to enhance reliability against both transient errors and permanent faults. Third, we propose a router power-gating and bypass design that powers off NoC components to reduce power and extend chip lifespan. Finally, because these techniques interact in complex, dynamic ways, we use DRL to train a proactive control policy that improves fault tolerance, reduces power consumption, and improves performance. Simulation using the PARSEC benchmark shows that CURE reduces end-to-end packet latency by 39%, improves energy efficiency by 92%, and lowers static and dynamic power consumption by 24% and 38%, respectively, over conventional solutions. Measured by mean time to failure (MTTF), CURE is 7.7x more reliable than the conventional NoC design.
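The abstract's idea of learning a control policy that trades off latency, power, and fault tolerance can be illustrated with a toy example. The sketch below is not the authors' controller: it swaps the deep network for plain tabular Q-learning, and the states, the three operation modes, the reward numbers, and the workload dynamics are all hypothetical values chosen only for illustration.

```python
import random

# Hypothetical operation modes; the abstract does not list the controller's actions.
ACTIONS = ["normal", "strong_ecc", "power_gated"]
# Hypothetical discretized router state: (traffic level, error level).
STATES = [(t, e) for t in ("low", "high") for e in ("low", "high")]


def toy_reward(state, action):
    """Made-up reward that penalizes latency, power, and unprotected errors."""
    traffic, errors = state
    gated_latency = 4.0 if traffic == "high" else 1.2
    latency = {"normal": 1.0, "strong_ecc": 1.5, "power_gated": gated_latency}[action]
    power = {"normal": 1.0, "strong_ecc": 1.3, "power_gated": 0.2}[action]
    # Assumed penalty when error rates are high and strong correction is off.
    fault_penalty = 3.0 if (errors == "high" and action != "strong_ecc") else 0.0
    return -(latency + power + fault_penalty)


def train(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        reward = toy_reward(state, action)
        # Toy dynamics (assumption): the next state does not depend on the action.
        next_state = rng.choice(STATES)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the one-step bootstrapped target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued mode for each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    for state, mode in greedy_policy(train()).items():
        print(state, "->", mode)
```

With these made-up rewards, the learned policy should favor power gating when traffic and error rates are both low, and stronger error correction when error rates are high. This mirrors the kind of trade-off the abstract describes, not the paper's reported behavior.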

Award ID(s): 1812495; 1702980

PAR ID: 10147076

Author(s) / Creator(s): Wang, Ke; Louri, Ahmed

Date Published: 2020-04-08

Journal Name: IEEE Transactions on Parallel and Distributed Systems

ISSN: 1045-9219

Page Range / eLocation ID: 1 to 1

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1109/TPDS.2020.2986297
