

Title: Optimal Load Balancing with Locality Constraints
Applications in cloud platforms motivate the study of efficient load balancing under job-server constraints and server heterogeneity. In this paper, we study load balancing on a bipartite graph where left nodes correspond to job types and right nodes correspond to servers, with each edge indicating that a job type can be served by a server. Edges thus represent locality constraints: a job can only be served at servers that hold certain data and/or machine learning (ML) models. Servers in this system can have heterogeneous service rates. In this setting, we investigate the performance of two policies, Join-the-Fastest-of-the-Shortest-Queue (JFSQ) and Join-the-Fastest-of-the-Idle-Queue (JFIQ), which are simple variants of Join-the-Shortest-Queue and Join-the-Idle-Queue in which ties are broken in favor of the fastest servers. Under a "well-connected" graph condition, we show that JFSQ and JFIQ are asymptotically optimal in mean response time as the number of servers goes to infinity. In addition to asymptotic optimality, we also obtain upper bounds on the mean response time for finite-size systems. We further show that the well-connectedness condition can be satisfied by a random bipartite graph construction with relatively sparse connectivity.
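
As a rough illustration of the tie-breaking in these two policies (a sketch only, not the paper's implementation; the names compatible, queue_len, and rate are placeholders, and the JFIQ fallback when no compatible server is idle is an assumption):

```python
def jfsq(compatible, queue_len, rate):
    """Join-the-Fastest-of-the-Shortest-Queue: among the compatible servers
    with the minimum queue length, send the job to the fastest one."""
    shortest = min(queue_len[i] for i in compatible)
    candidates = [i for i in compatible if queue_len[i] == shortest]
    return max(candidates, key=lambda i: rate[i])

def jfiq(compatible, queue_len, rate):
    """Join-the-Fastest-of-the-Idle-Queue: if some compatible server is idle,
    send the job to the fastest idle one; otherwise fall back to JFSQ here
    (the paper's exact fallback rule is not specified in this abstract)."""
    idle = [i for i in compatible if queue_len[i] == 0]
    if idle:
        return max(idle, key=lambda i: rate[i])
    return jfsq(compatible, queue_len, rate)
```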
Award ID(s):
1739189 1934986
NSF-PAR ID:
10298041
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the ACM on Measurement and Analysis of Computing Systems
Volume:
4
Issue:
3
ISSN:
2476-1249
Page Range / eLocation ID:
1 to 37
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper studies load balancing for many-server systems with N servers. Each server has a buffer of size b − 1, and can have at most one job in service and b − 1 jobs in the buffer. The service time of a job follows the Coxian-2 distribution. We focus on steady-state performance of load balancing policies in the heavy-traffic regime where the normalized load of the system is λ = 1 − N^(−α) for 0 < α < 0.5. We identify a set of policies that achieve asymptotically zero waiting. This set includes several classical policies such as join-the-shortest-queue (JSQ), join-the-idle-queue (JIQ), idle-one-first (I1F), and power-of-d-choices (Po-d) with d = O(N^α log N). The proof of the main result is based on Stein's method and state space collapse. A key technical contribution of this paper is the iterative state space collapse approach, which leads to a simple generator approximation when applying Stein's method.
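
To make the scaling regime concrete, a small sketch (the function name and the constant c are illustrative, not from the paper) of how the load and the power-of-d sample size scale with N and α:

```python
import math

def heavy_traffic_params(N, alpha, c=1.0):
    """Illustrative only: the normalized load and a power-of-d sample size
    consistent with the regime above (λ = 1 − N^(−α), d = Θ(N^α log N));
    the constant c is a placeholder, not a value from the paper."""
    assert 0 < alpha < 0.5
    load = 1.0 - N ** (-alpha)
    d = max(1, math.ceil(c * N ** alpha * math.log(N)))
    return load, d

# Example: with N = 10,000 servers and α = 0.4, the load is about 0.975
# and d is about 367.
print(heavy_traffic_params(10_000, 0.4))
```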

     
  2. The fundamental problem in the study of parallel-server systems is that of finding and analyzing routing policies of arriving jobs to the servers that efficiently balance the load on the servers. The most well-studied policies are (in decreasing order of efficiency) join the shortest workload (JSW), which assigns arrivals to the server with the least workload; join the shortest queue (JSQ), which assigns arrivals to the smallest queue; the power-of-d (PW(d)), which assigns arrivals to the shortest among d queues that are sampled from the total of n queues uniformly at random; and uniform routing, under which arrivals are routed to one of the n queues uniformly at random. In this paper we study the stability problem of parallel-server systems, assuming that routing errors may occur, so that arrivals may be routed to the wrong queue (not the smallest among the relevant queues) with a positive probability. We treat this routing mechanism as a probabilistic routing policy, named a p-allocation policy, that generalizes the PW(d) policy, and thus also JSQ and uniform routing, where p is an n-dimensional vector whose components are the routing probabilities. Our goal is to study the (in)stability problem of the system under this routing mechanism, and under its "nonidling" version, which assigns new arrivals to an idle server, if such a server is available, and otherwise routes according to the p-allocation rule. We characterize a sufficient condition for stability, and prove that the stability region, as a function of the system's primitives and p, is in general smaller than the set [Formula: see text]. Our analyses build on representing the queue process as a continuous-time Markov chain in an ordered space of n-dimensional real-valued vectors, and using a generalized form of the Schur-convex order.
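
A minimal sketch of this routing mechanism, under the assumption that the p-allocation rule sends an arrival to the i-th shortest queue with probability p[i]; the paper's exact convention may differ, and the uniform choice among idle servers in the nonidling variant is a placeholder:

```python
import random

def p_allocation(queue_len, p):
    """Route the arrival to the i-th shortest queue with probability p[i].
    p = (1, 0, ..., 0) recovers JSQ, a uniform p recovers uniform routing,
    and the PW(d) sampling distribution is another special case. The exact
    convention used in the paper may differ; this is only a sketch."""
    ranked = sorted(range(len(queue_len)), key=lambda i: queue_len[i])
    rank = random.choices(range(len(ranked)), weights=p, k=1)[0]
    return ranked[rank]

def nonidling_p_allocation(queue_len, p):
    """Nonidling version: join an idle server when one exists (the choice
    among idle servers is uniform here, as a placeholder); otherwise apply
    the p-allocation rule."""
    idle = [i for i, q in enumerate(queue_len) if q == 0]
    if idle:
        return random.choice(idle)
    return p_allocation(queue_len, p)
```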
  3. Consider a system with N identical single-server queues and a number of task types, where each server is able to process only a small subset of possible task types. Arriving tasks select d(N) random compatible servers and join the shortest queue among them. The compatibility constraints are captured by a fixed bipartite graph between the servers and the task types. When the graph is complete bipartite, the mean-field approximation is accurate. However, such dense compatibility graphs are infeasible for large-scale implementation. We characterize a class of sparse compatibility graphs for which the mean-field approximation remains valid. For this, we introduce a novel notion, called proportional sparsity, and establish that systems with proportionally sparse compatibility graphs asymptotically match the performance of a fully flexible system. Furthermore, we show that proportionally sparse random compatibility graphs can be constructed, which reduce the server degree almost by a factor N/ln(N) compared with the complete bipartite compatibility graph.
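
For concreteness, a small sketch of the compatibility-graph interface, using a plain uniform random construction rather than the paper's proportional-sparsity construction; the degree in the example is an arbitrary assumption:

```python
import math
import random

def random_compatibility_graph(num_types, N, degree):
    """Illustrative random bipartite compatibility graph: each task type is
    made compatible with `degree` servers chosen uniformly at random. The
    paper's proportionally sparse construction is more refined; this only
    shows the interface."""
    return {t: random.sample(range(N), degree) for t in range(num_types)}

def jsq_d(task_type, graph, queue_len, d):
    """Sample d compatible servers uniformly at random and join the shortest
    queue among them."""
    compatible = graph[task_type]
    sampled = random.sample(compatible, min(d, len(compatible)))
    return min(sampled, key=lambda i: queue_len[i])

# Example with an arbitrary logarithmic degree (an assumption for
# illustration, not the paper's prescription).
N = 1000
graph = random_compatibility_graph(num_types=50, N=N, degree=5 * math.ceil(math.log(N)))
chosen = jsq_d(task_type=3, graph=graph, queue_len=[0] * N, d=2)
```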
  4. Cloud computing today is dominated by multi-server jobs. These are jobs that request multiple servers simultaneously and hold onto all of these servers for the duration of the job. Multi-server jobs add a lot of complexity to the traditional one-server-per-job model: an arrival might not "fit" into the available servers and might have to queue, blocking later arrivals and leaving servers idle. From a queueing perspective, almost nothing is understood about multi-server job queueing systems; even understanding the exact stability region is a very hard problem. In this paper, we investigate a multi-server job queueing model under scaling regimes where the number of servers in the system grows. Specifically, we consider a system with multiple classes of jobs, where jobs from different classes can request different numbers of servers and have different service time distributions, and jobs are served in first-come-first-served order. The multi-server job model opens up new scaling regimes where both the number of servers that a job needs and the system load scale with the total number of servers. Within these scaling regimes, we derive the first results on stability, queueing probability, and the transient analysis of the number of jobs in the system for each class. In particular we derive sufficient conditions for zero queueing. Our analysis introduces a novel way of extracting information from the Lyapunov drift, which can be applicable to a broader scope of problems in queueing systems.
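
The head-of-line blocking that complicates this model is easy to see in a sketch; the following minimal class is illustrative only, with invented names and no arrival or service dynamics:

```python
from collections import deque

class MultiServerJobQueue:
    """Minimal sketch of the first-come-first-served multi-server job model
    described above (class and method names are invented for illustration;
    arrival and service-time dynamics are omitted)."""

    def __init__(self, total_servers):
        self.free = total_servers
        self.waiting = deque()  # FCFS buffer of per-job server demands

    def arrive(self, servers_needed):
        self.waiting.append(servers_needed)
        self._try_start()

    def depart(self, servers_released):
        self.free += servers_released
        self._try_start()

    def _try_start(self):
        # Start jobs strictly in arrival order and stop at the first job
        # that does not fit: this head-of-line blocking is what can leave
        # servers idle while later jobs wait.
        while self.waiting and self.waiting[0] <= self.free:
            self.free -= self.waiting.popleft()
```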