HDRLPIM: A Simulator for Hyper-Dimensional Reinforcement Learning Based on Processing In-Memory

Rakka, Mariam; Amer, Walaa; Chen, Hanning; Imani, Mohsen; Kurdahi, Fadi

doi:10.1145/3695875

Citation Details

HDRLPIM: A Simulator for Hyper-Dimensional Reinforcement Learning Based on Processing In-Memory

Processing In-Memory (PIM) is a data-centric computation paradigm that performs computations inside the memory, hence eliminating the memory wall problem in traditional computational paradigms used in Von-Neumann architectures. The associative processor, a type of PIM architecture, allows performing parallel and energy-efficient operations on vectors. This architecture is found useful in vector-based applications such as Hyper-Dimensional (HDC) Reinforcement Learning (RL). HDC is rising as a new powerful and lightweight alternative to costly traditional RL models such as Deep Q-Learning. The HDC implementation of Q-Learning relies on encoding the states in a high-dimensional representation where calculating Q-values and finding the maximum one can be done entirely in parallel. In this article, we propose to implement the main operations of a HDC RL framework on the associative processor. This acceleration achieves up to\(152.3\times\)and\(6.4\times\)energy and time savings compared to an FPGA implementation. Moreover, HDRLPIM shows that an SRAM-based AP implementation promises up to\(968.2\times\)energy-delay product gains compared to the FPGA implementation. more »