Title: CASH: compiler assisted hardware design for improving DRAM energy efficiency in CNN inference
The advent of machine learning (ML) and deep learning applications has led to the development of a multitude of hardware accelerators and architectural optimization techniques for parallel architectures. This is due in part to the regularity and parallelism exhibited by ML workloads, especially convolutional neural networks (CNNs). However, CPUs remain one of the dominant compute fabrics in datacenters today and are therefore widely deployed for inference tasks. As CNNs grow larger, the inherent limitations of a CPU-based system become apparent, specifically in terms of main memory data movement. In this paper, we present CASH, a compiler-assisted hardware solution that eliminates redundant data movement to and from main memory and, therefore, reduces main memory bandwidth and energy consumption. Our experimental evaluations on a set of four different state-of-the-art CNN workloads indicate that CASH provides, on average, ~40% and ~18% reductions in main memory bandwidth and energy consumption, respectively.
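The record does not include code, but the saving CASH targets is easy to illustrate: the output feature map of one CNN layer is immediately consumed by the next, so writing it to DRAM and reading it back is redundant whenever it can stay on chip. The Python sketch below estimates that saving under hypothetical layer shapes and an assumed on-chip buffer size; it illustrates the idea only, not the paper's mechanism or results.

```python
# Hedged sketch: estimate DRAM traffic saved by keeping inter-layer feature
# maps on chip instead of writing them back to main memory and re-reading them.
# Layer shapes and the on-chip buffer size below are illustrative, not from CASH.

def feature_map_bytes(c, h, w, dtype_bytes=1):
    """Size of one feature map (channels x height x width) in bytes."""
    return c * h * w * dtype_bytes

def dram_traffic(layers, on_chip_bytes):
    """Return (baseline, optimized) DRAM traffic in bytes for a layer chain.

    Baseline: every layer writes its output to DRAM and the next layer reads it.
    Optimized: if an inter-layer feature map fits in the on-chip buffer, both
    the write-back and the subsequent read are elided (a CASH-style saving).
    """
    baseline = optimized = 0
    for c, h, w in layers:
        fmap = feature_map_bytes(c, h, w)
        baseline += 2 * fmap            # write by producer + read by consumer
        if fmap > on_chip_bytes:        # too large: still spills to DRAM
            optimized += 2 * fmap
    return baseline, optimized

if __name__ == "__main__":
    # Illustrative layer output shapes (channels, height, width).
    layers = [(64, 224, 224), (128, 112, 112), (256, 56, 56), (512, 28, 28)]
    base, opt = dram_traffic(layers, on_chip_bytes=2 * 1024 * 1024)
    print(f"baseline {base/1e6:.1f} MB, optimized {opt/1e6:.1f} MB, "
          f"saved {(1 - opt/base):.0%}")
```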
Award ID(s):
1763681
PAR ID:
10170704
Author(s) / Creator(s):
Date Published:
Journal Name:
International Symposium on Memory Systems
Page Range / eLocation ID:
396 to 407
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Recent advances in GPU-based manycore accelerators provide the opportunity to efficiently process large-scale graphs on chip. However, real-world graphs have a diverse range of topology and connectivity patterns (e.g., degree distributions) that make the design of input-agnostic hardware architectures a challenge. Network-on-Chip (NoC)-based architectures provide a way to overcome this challenge, as the architectural topology can be used to approximately model the expected traffic patterns that emerge from graph application workloads. In this paper, we first study the mix of long- and short-range traffic patterns generated on chip by graph workloads, and subsequently use the findings to adapt the design of an optimal NoC-based architecture. In particular, by leveraging emerging three-dimensional (3D) integration technology, we propose the design of a small-world NoC (SWNoC)-enabled manycore GPU architecture, where the placement of the links connecting the streaming multiprocessors (SMs) and the memory controllers (MCs) follows a power-law distribution. The proposed 3D manycore GPU architecture outperforms its traditional planar (2D) counterparts in both performance and energy consumption. Moreover, by adopting a joint performance-thermal optimization strategy, we address the thermal concerns in a 3D design without noticeably compromising the achievable performance. The 3D integration technology is also leveraged to incorporate Near Data Processing (NDP) to complement the performance benefits introduced by the SWNoC architecture. As graph applications are inherently memory intensive, off-chip data movement gives rise to latency and energy overheads in the presence of external DRAM. In conventional GPU architectures, as the main memory layer is not integrated with the logic, off-chip data movement negatively impacts overall performance and energy consumption. We demonstrate that NDP significantly reduces the overheads associated with such frequent and irregular memory accesses in graph-based applications. The proposed SWNoC-enabled NDP framework, which integrates 3D memory (like Micron's HMC) with a massive number of GPU cores, achieves 29.5% performance improvement and 30.03% less energy consumption on average compared to a conventional planar Mesh-based design with external DRAM.
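As a rough illustration of the small-world link placement described above, the sketch below builds a 3D mesh and then adds long-range links whose destinations are drawn with probability proportional to distance^(-alpha), i.e., a power-law falloff. The grid size, alpha, and per-node link budget are assumptions for illustration; the paper's actual construction and parameters may differ.

```python
# Hedged sketch of small-world link placement: connect each node to its mesh
# neighbors, then add long-range links whose endpoints are chosen with
# probability proportional to distance**(-alpha) (a power-law falloff).

import random

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def small_world_links(dim=(4, 4, 4), alpha=2.0, extra_links_per_node=1, seed=0):
    rng = random.Random(seed)
    nodes = [(x, y, z) for x in range(dim[0])
                       for y in range(dim[1])
                       for z in range(dim[2])]
    links = set()
    # Short-range links: nearest neighbors in the 3D mesh.
    for a in nodes:
        for b in nodes:
            if manhattan(a, b) == 1:
                links.add(tuple(sorted((a, b))))
    # Long-range links: destination drawn with power-law probability in distance.
    for a in nodes:
        others = [b for b in nodes if b != a]
        weights = [manhattan(a, b) ** (-alpha) for b in others]
        for _ in range(extra_links_per_node):
            b = rng.choices(others, weights=weights, k=1)[0]
            links.add(tuple(sorted((a, b))))
    return links

links = small_world_links()
long_range = sum(1 for a, b in links if manhattan(a, b) > 1)
print(f"{len(links)} links total, {long_range} long-range")
```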
  2. Today, larger memory capacity and higher memory bandwidth are required for better performance and energy efficiency in many important client and datacenter applications. Hardware memory compression provides a promising direction to achieve this without increasing system cost. Unfortunately, current memory compression solutions face two significant challenges. First, keeping memory compressed requires additional memory accesses, sometimes on the critical path, which can cause performance overheads. Second, they require changing the operating system to take advantage of the increased capacity and to handle incompressible data, which delays deployment. We propose Compresso, a hardware memory compression architecture that minimizes memory overheads due to compression, with no changes to the OS. We identify new data-movement trade-offs and propose optimizations that reduce additional memory movement to improve system efficiency. We propose a holistic evaluation for compressed systems. Our results show that Compresso achieves a 1.85x compression ratio for main memory on average, with a 24% speedup over a competitive hardware-compressed system for single-core systems and 27% for multi-core systems. Compared to competitive compressed systems, Compresso not only reduces the performance overhead of compression, but also increases the performance gain from higher memory capacity.
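To make the idea of hardware memory compression concrete, the sketch below runs a generic base+delta compressibility check over a 64-byte cache line, the kind of per-line decision a hardware compressor makes before storing a line compactly. The encoding, word size, and thresholds are illustrative assumptions, not Compresso's actual compression algorithm or metadata layout.

```python
# Hedged illustration of a per-line compressibility check. This is a generic
# base+delta encoding over 8-byte words, not Compresso's actual format.

import struct

def base_delta_compressed_size(line: bytes, delta_bytes=2):
    """Return compressed size of a 64-byte line, or 64 if it doesn't compress.

    Encoding: store the first 8-byte word as a base, then one small signed
    delta per remaining word, if every delta fits in `delta_bytes` bytes.
    """
    assert len(line) == 64
    words = struct.unpack("<8Q", line)
    base = words[0]
    limit = 1 << (8 * delta_bytes - 1)
    deltas = [w - base for w in words[1:]]
    if all(-limit <= d < limit for d in deltas):
        return 8 + delta_bytes * len(deltas)    # base + packed deltas
    return 64                                   # incompressible: store raw

# Example: an array of nearby pointers compresses well; random data would not.
nearby = struct.pack("<8Q", *[0x7F00001000 + 16 * i for i in range(8)])
print(base_delta_compressed_size(nearby))   # 22 bytes instead of 64
```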
  3. PIM (processing-in-memory) based hardware accelerators have shown great potential in addressing the computation and memory-access intensity of modern CNNs (convolutional neural networks). While adopting NVM (non-volatile memory) helps to further mitigate storage and energy consumption overheads, and adopting quantization, e.g., shift-based quantization, helps to trade off computation overhead against accuracy loss, naively integrating both NVM and quantization in hardware accelerators leads to sub-optimal acceleration. In this paper, we exploit the natural shift property of DWM (domain wall memory) to devise DWMAcc, a DWM-based accelerator with asymmetrical storage of weight and input data, to speed up the inference phase of shift-based CNNs. DWMAcc supports flexible shift operations to enable fast processing with low performance and area overhead. We then optimize it with zero-sharing, input-reuse, and weight-share schemes. Our experimental results show that, on average, DWMAcc achieves 16.6× performance improvement and 85.6× energy consumption reduction over a state-of-the-art SRAM-based design.
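Shift-based quantization, mentioned above, rounds each weight to a signed power of two so that a multiplication becomes a bit shift. The sketch below shows that transformation in plain Python; the rounding policy and example values are illustrative assumptions rather than DWMAcc's exact scheme.

```python
# Hedged sketch of shift-based quantization: each weight is approximated by a
# signed power of two, so multiplying an activation by it reduces to a shift
# (the kind of operation DWM's natural shift property maps onto).

import math

def quantize_to_shift(w):
    """Quantize w to sign * 2**e with integer e; zero maps to (0, 0)."""
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    e = round(math.log2(abs(w)))
    return sign, e

def shift_multiply(x: int, sign: int, e: int) -> int:
    """Multiply an integer activation by a power-of-two weight using shifts."""
    if sign == 0:
        return 0
    y = x << e if e >= 0 else x >> -e
    return sign * y

# Example: weight 0.23 quantizes to +2**-2, so multiplying activation 100
# becomes a right shift by 2 instead of a full multiply.
sign, e = quantize_to_shift(0.23)
print(sign, e, shift_multiply(100, sign, e))   # 1 -2 25
```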
  4. The current state of neuromorphic computing broadly encompasses domain-specific computing architectures designed to accelerate machine learning (ML) and artificial intelligence (AI) algorithms. As is well known, AI/ML algorithms are limited by memory bandwidth, and novel computing architectures are necessary to overcome this limitation. Several options are currently under investigation using both mature and emerging memory technologies. For example, mature memory technologies such as high-bandwidth memories (HBMs) are integrated with logic units on the same die to bring memory closer to the computing units. There are also research efforts in which in-memory computing architectures have been implemented using DRAM or flash memory technologies. However, DRAM suffers from scaling limitations, while flash memory devices suffer from endurance issues. Additionally, in spite of this significant progress, the massive energy consumption of neuromorphic processors, while still meeting the training and inference performance required by future AI/ML applications, remains to be addressed. On the AI/ML algorithm side, several issues remain open, such as lifelong learning, explainability, context-based decision making, multimodal association of data, adaptation to personalized responses, and resiliency. These unresolved challenges in AI/ML have led researchers to explore brain-inspired computing architectures and paradigms.
  5. Data-movement latency when using on-chip accelerators in emerging heterogeneous architectures is a serious performance bottleneck. While hardware/software mechanisms such as peer-to-peer DMA between producer/consumer accelerators allow bypassing main memory and significantly reduce main memory contention, schedulers in both the hardware and software domains remain oblivious to their presence. Instead, most contemporary schedulers tend to be deadline-driven, with improved utilization and/or throughput serving as secondary or co-primary goals. This lack of focus on data communication will only worsen execution times as accelerator latencies decrease. In this paper, we present RELIEF (RElaxing Least-laxIty to Enable Forwarding), an online least-laxity-driven accelerator scheduling policy that relieves memory pressure in accelerator-rich architectures via data-movement-aware scheduling. RELIEF leverages laxity (time margin to a deadline) to opportunistically utilize available hardware data-forwarding mechanisms while minimizing quality-of-service (QoS) degradation and unfairness. RELIEF achieves up to 50% more forwards compared to state-of-the-art policies, reducing main memory traffic and energy consumption by up to 32% and 18%, respectively. At the same time, RELIEF meets 14% more task deadlines on average and reduces the worst-case deadline violation by 14%, highlighting QoS and fairness improvements.
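The policy's core ingredients, laxity and forwarding-aware relaxation, can be sketched in a few lines. The snippet below picks the least-laxity task but lets a forwarding-eligible task run first when the most urgent task still has enough slack to absorb the delay; it is a simplified illustration with hypothetical task fields, not RELIEF's exact algorithm.

```python
# Hedged sketch of laxity-relaxed scheduling: normally dispatch the task with
# the least laxity (deadline - now - remaining work), but let a task that can
# consume forwarded on-chip data jump ahead when the least-laxity task still
# has enough slack to wait.

from dataclasses import dataclass

@dataclass
class Task:
    name: str
    deadline: float
    remaining: float        # estimated execution time left
    can_forward: bool       # producer output is still available on chip

def laxity(task: Task, now: float) -> float:
    return task.deadline - now - task.remaining

def pick_next(tasks, now):
    """Least-laxity first, relaxed to favor a forwarding-eligible task."""
    urgent = min(tasks, key=lambda t: laxity(t, now))
    for t in tasks:
        # Prefer a forwardable task if the urgent task can absorb the delay.
        if t.can_forward and laxity(urgent, now) >= t.remaining:
            return t
    return urgent

tasks = [
    Task("decode", deadline=10.0, remaining=4.0, can_forward=False),
    Task("filter", deadline=20.0, remaining=3.0, can_forward=True),
]
print(pick_next(tasks, now=0.0).name)   # "filter": decode has 6 units of slack
```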