LRMP: Layer Replication with Mixed Precision for spatial in-memory DNN accelerators

Nallathambi, Abinand; Bose, Christin David; Haensch, Wilfried; Raghunathan, Anand

doi:10.3389/frai.2024.1268317

Citation Details

In-memory computing (IMC) with non-volatile memories (NVMs) has emerged as a promising approach to address the rapidly growing computational demands of Deep Neural Networks (DNNs). Mapping DNN layers spatially onto NVM-based IMC accelerators achieves high degrees of parallelism. However, two challenges that arise in this approach are the highly non-uniform distribution of layer processing times and high area requirements. We propose LRMP, a method to jointly apply layer replication and mixed precision quantization to improve the performance of DNNs when mapped to area-constrained IMC accelerators. LRMP uses a combination of reinforcement learning and mixed integer linear programming to search the replication-quantization design space using a model that is closely informed by the target hardware architecture. Across five DNN benchmarks, LRMP achieves 2.6–9.3× latency and 8–18× throughput improvement at minimal (<1%) degradation in accuracy.
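To illustrate why replication helps when layer processing times are highly non-uniform, here is a minimal sketch of replicating the slowest layer under an area budget. It is not the LRMP method: the abstract describes a search based on reinforcement learning and mixed integer linear programming, whereas this sketch swaps in a simple greedy heuristic and ignores quantization entirely. The function names, the per-layer latencies, the tile costs, and the assumption that a layer's effective latency scales as latency divided by its replica count are all hypothetical and chosen only for illustration.

```python
import heapq


def greedy_replication(latencies, tiles_per_copy, tile_budget):
    """Greedily add replicas to the current bottleneck layer.

    Assumption (not from the paper): a layer with r replicas has an
    effective latency of latency / r, and each replica of layer i costs
    tiles_per_copy[i] tiles.
    """
    replicas = [1] * len(latencies)
    used = sum(tiles_per_copy)  # one copy of every layer is mandatory
    if used > tile_budget:
        raise ValueError("budget too small for one copy of every layer")

    # Max-heap (via negated keys) on each layer's effective latency.
    heap = [(-lat, i) for i, lat in enumerate(latencies)]
    heapq.heapify(heap)

    while heap:
        _, i = heapq.heappop(heap)
        if used + tiles_per_copy[i] > tile_budget:
            # The bottleneck cannot be replicated again, so replicating
            # any other layer would not reduce the slowest stage.
            break
        replicas[i] += 1
        used += tiles_per_copy[i]
        heapq.heappush(heap, (-latencies[i] / replicas[i], i))
    return replicas


def bottleneck_latency(latencies, replicas):
    """Effective latency of the slowest layer under the same assumption."""
    return max(lat / r for lat, r in zip(latencies, replicas))


if __name__ == "__main__":
    # Hypothetical three-layer network with one dominant layer.
    lat = [800, 200, 100]
    reps = greedy_replication(lat, tiles_per_copy=[1, 1, 1], tile_budget=6)
    print(reps, bottleneck_latency(lat, reps))
```

In this toy setting all spare tiles go to the dominant layer until it stops being the sole bottleneck. A joint search such as the one the abstract describes would additionally trade off per-layer precision against area and accuracy, which this sketch does not attempt.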

Award ID(s): 2107011

PAR ID: 10663209

Author(s) / Creator(s): Nallathambi, Abinand; Bose, Christin David; Haensch, Wilfried; Raghunathan, Anand

Publisher / Repository: Frontiers in Artificial Intelligence

Date Published: 2024-10-04

Journal Name: Frontiers in Artificial Intelligence

Volume: 7

ISSN: 2624-8212

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Journal Article:
https://doi.org/10.3389/frai.2024.1268317
