HAT-DRL: Hotspot-Aware Task Mapping for Lifetime Improvement of Multicore System using Deep Reinforcement Learning

J. Zhang, S. Sadiqbatcha

In this work, we propose a novel learning-based task-to-core mapping technique that uses deep reinforcement learning to improve the lifetime and reliability of multicore systems. The method is motivated by the observation that on-chip temperature sensors may not capture the true hotspots of the chip, which can lead to sub-optimal control decisions. We first perform data-driven learning to model the hotspot activation indicator as a function of the resource utilization of different workloads. On top of this model, we employ the soft actor-critic algorithm, a recently proposed, robust, and sample-efficient deep reinforcement learning method that learns maximum-entropy policies, to improve long-term reliability and minimize the performance degradation caused by NBTI/HCI effects. Lifetime and reliability improvements are achieved through a reward function that penalizes continuously stressing the same hotspots and encourages even stressing of cores. The proposed algorithm is validated on an Intel i7-8650U four-core CPU platform executing CPU benchmark workloads with various hotspot activation profiles. Our experimental results show that the proposed method balances the stress across all cores and hotspots, and achieves 50% and 160% longer lifetime compared to non-hotspot-aware and Linux default scheduling, respectively. The proposed method can also reduce the average temperature by exploiting the true-hotspot information.
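The reward idea described above (penalize repeatedly stressing the same hotspot, encourage even stress across cores) can be illustrated with a minimal sketch. This is not the authors' reward function: the name `stress_balance_reward`, the use of variance as the imbalance measure, and the `repeat_penalty` parameter are all assumptions made for illustration only.

```python
from typing import Sequence


def stress_balance_reward(
    stress: Sequence[float],
    prev_core: int,
    new_core: int,
    repeat_penalty: float = 1.0,
) -> float:
    """Hypothetical reward for one task-mapping decision.

    stress         -- accumulated stress per core (or per hotspot), assumed
                      to be tracked elsewhere, after the new task is placed
    prev_core      -- index of the core chosen at the previous step
    new_core       -- index of the core chosen at this step
    repeat_penalty -- assumed fixed cost for stressing the same core twice
                      in a row
    """
    if not stress:
        raise ValueError("stress must contain at least one entry")

    # Imbalance term: population variance of the accumulated stress.
    # It is zero when every core has been stressed equally.
    mean = sum(stress) / len(stress)
    variance = sum((s - mean) ** 2 for s in stress) / len(stress)

    # Repetition term: discourage mapping to the same core back to back.
    penalty = repeat_penalty if new_core == prev_core else 0.0

    # Higher reward means more even stress and less repetition.
    return -variance - penalty
```

Under these assumptions, a perfectly balanced stress vector with a change of core yields the maximum reward of zero, while an unbalanced vector or a repeated core selection yields a negative reward. A reinforcement learning agent maximizing the cumulative value of such a signal would be pushed toward rotating tasks across cores.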
