AxoNN: energy-aware execution of neural network inference on multi-accelerator heterogeneous SoCs

Dagli, Ismet; Cieslewicz, Alexander; McClurg, Jedidiah; Belviranli, Mehmet E.

doi:10.1145/3489517.3530572


The energy and latency demands of critical workload execution, such as object detection, in embedded systems vary based on the physical system state and other external factors. Many recent mobile and autonomous Systems-on-Chip (SoCs) embed a diverse range of accelerators with unique power and performance characteristics. The execution flow of the critical workloads can be adjusted to span multiple accelerators so that the trade-off between performance and energy fits the dynamically changing physical factors. In this study, we propose running neural network (NN) inference on multiple accelerators of an SoC. Our goal is to enable an energy-performance trade-off by distributing the layers of an NN between a performance-efficient and a power-efficient accelerator. We first provide an empirical modeling methodology to characterize execution and inter-layer transition times. We then find an optimal layers-to-accelerator mapping by representing the trade-off as a linear programming optimization constraint. We evaluate our approach on the NVIDIA Xavier AGX SoC with commonly used NN models. We use the Z3 SMT solver to find schedules for different energy consumption targets, with up to 98% prediction accuracy.
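The layers-to-accelerator mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the Z3-based formulation for a brute-force search over all assignments, and every name and number in it (the `GPU` and `DLA` labels, the per-layer costs, the transition costs, `evaluate`, `best_schedule`) is a hypothetical placeholder rather than a measurement or API from the paper.

```python
from itertools import product

# Hypothetical per-layer costs for a 4-layer network (NOT data from the paper).
# Times are in milliseconds, energies in millijoules.
TIME = {"GPU": [2.0, 3.0, 1.5, 1.0], "DLA": [3.5, 5.0, 2.0, 1.8]}
ENERGY = {"GPU": [10.0, 16.0, 7.0, 5.0], "DLA": [5.0, 8.0, 4.0, 3.0]}

# Assumed fixed cost of moving execution between accelerators at a layer boundary.
TRANSITION_TIME = 0.5
TRANSITION_ENERGY = 1.0


def evaluate(schedule):
    """Return (total_time, total_energy) for a per-layer accelerator assignment."""
    time = sum(TIME[acc][i] for i, acc in enumerate(schedule))
    energy = sum(ENERGY[acc][i] for i, acc in enumerate(schedule))
    # Count the layer boundaries where the accelerator changes.
    switches = sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev != cur)
    return (time + switches * TRANSITION_TIME,
            energy + switches * TRANSITION_ENERGY)


def best_schedule(energy_budget):
    """Exhaustively find the fastest schedule whose energy fits the budget.

    Returns (schedule, time, energy), or None if no schedule is feasible.
    """
    num_layers = len(TIME["GPU"])
    best = None
    for schedule in product(TIME, repeat=num_layers):
        time, energy = evaluate(schedule)
        if energy <= energy_budget and (best is None or time < best[1]):
            best = (schedule, time, energy)
    return best


if __name__ == "__main__":
    for budget in (40.0, 30.0, 20.0):
        print(budget, best_schedule(budget))
```

Exhaustive search examines 2^n assignments for n layers, so it is only practical for toy sizes. A solver-based formulation such as the one the abstract describes expresses the same energy target as a constraint and leaves the search to the solver.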

Award ID(s): 2124010

PAR ID: 10358753

Author(s) / Creator(s): Dagli, Ismet; Cieslewicz, Alexander; McClurg, Jedidiah; Belviranli, Mehmet E.

Date Published: 2022-07-10

Journal Name: DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

Page Range / eLocation ID: 1069 to 1074

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3489517.3530572
