MAPA: multi-accelerator pattern allocation policy for multi-tenant GPU servers

Ranganath, Kiran; Suetterlein, Joshua D.; Manzano, Joseph B.; Song, Shuaiwen Leon; Wong, Daniel

doi:10.1145/3458817.3480853

Multi-accelerator servers are increasingly deployed in shared multi-tenant environments (such as cloud data centers) to meet the demands of large-scale, compute-intensive workloads. In addition, these accelerators are increasingly interconnected in complex topologies, and workloads exhibit a wider variety of inter-accelerator communication patterns. However, existing allocation policies are ill-suited for these emerging use cases. Specifically, this work identifies that multi-accelerator workloads are commonly fragmented, leading to reduced bandwidth and increased latency for inter-accelerator communication. We propose Multi-Accelerator Pattern Allocation (MAPA), a graph pattern mining approach that provides generalized support for allocating multi-accelerator workloads on multi-accelerator servers. We demonstrate that MAPA improves the execution time of multi-accelerator workloads and that its benefits generalize across various accelerator topologies. Finally, we demonstrate that, compared with the baseline policy, MAPA achieves a speedup of 12.4% for the 75th percentile of jobs and reduces worst-case execution time by up to 35%.

Award ID(s): 2047521

PAR ID: 10319195

Author(s) / Creator(s): Ranganath, Kiran; Suetterlein, Joshua D.; Manzano, Joseph B.; Song, Shuaiwen Leon; Wong, Daniel

Date Published: 2021-11-13

Journal Name: SC'21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

Sponsoring Org: National Science Foundation

Conference Paper: https://doi.org/10.1145/3458817.3480853
