skip to main content


Title: Load Balancing Under Strict Compatibility Constraints
Consider a system with N identical single-server queues and a number of task types, where each server is able to process only a small subset of possible task types. Arriving tasks select [Formula: see text] random compatible servers and join the shortest queue among them. The compatibility constraints are captured by a fixed bipartite graph between the servers and the task types. When the graph is complete bipartite, the mean-field approximation is accurate. However, such dense compatibility graphs are infeasible for large-scale implementation. We characterize a class of sparse compatibility graphs for which the mean-field approximation remains valid. For this, we introduce a novel notion, called proportional sparsity, and establish that systems with proportionally sparse compatibility graphs asymptotically match the performance of a fully flexible system. Furthermore, we show that proportionally sparse random compatibility graphs can be constructed, which reduce the server degree almost by a factor [Formula: see text] compared with the complete bipartite compatibility graph.  more » « less
Award ID(s):
2113027
NSF-PAR ID:
10323989
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Mathematics of Operations Research
ISSN:
0364-765X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Applications in cloud platforms motivate the study of efficient load balancing under job-server constraints and server heterogeneity. In this paper, we study load balancing on a bipartite graph where left nodes correspond to job types and right nodes correspond to servers, with each edge indicating that a job type can be served by a server. Thus edges represent locality constraints, i.e., an arbitrary job can only be served at servers which contain certain data and/or machine learning (ML) models. Servers in this system can have heterogeneous service rates. In this setting, we investigate the performance of two policies named Join-the-Fastest-of-the-Shortest-Queue (JFSQ) and Join-the-Fastest-of-the-Idle-Queue (JFIQ), which are simple variants of Join-the-Shortest-Queue and Join-the-Idle-Queue, where ties are broken in favor of the fastest servers. Under a "well-connected'' graph condition, we show that JFSQ and JFIQ are asymptotically optimal in the mean response time when the number of servers goes to infinity. In addition to asymptotic optimality, we also obtain upper bounds on the mean response time for finite-size systems. We further show that the well-connectedness condition can be satisfied by a random bipartite graph construction with relatively sparse connectivity. 
    more » « less
  2. We compute the eigenvalue fluctuations of uniformly distributed random biregular bipartite graphs with fixed and growing degrees for a large class of analytic functions. As a key step in the proof, we obtain a total variation distance bound for the Poisson approximation of the number of cycles and cyclically non-backtracking walks in random biregular bipartite graphs, which might be of independent interest. We also prove a semicircle law for random [Formula: see text]-biregular bipartite graphs when [Formula: see text]. As an application, we translate the results to adjacency matrices of uniformly distributed random regular hypergraphs. 
    more » « less
  3. The fundamental problem in the study of parallel-server systems is that of finding and analyzing routing policies of arriving jobs to the servers that efficiently balance the load on the servers. The most well-studied policies are (in decreasing order of efficiency) join the shortest workload (JSW), which assigns arrivals to the server with the least workload; join the shortest queue (JSQ), which assigns arrivals to the smallest queue; the power-of-[Formula: see text] (PW([Formula: see text])), which assigns arrivals to the shortest among [Formula: see text] queues that are sampled from the total of [Formula: see text] queues uniformly at random; and uniform routing, under which arrivals are routed to one of the [Formula: see text] queues uniformly at random. In this paper we study the stability problem of parallel-server systems, assuming that routing errors may occur, so that arrivals may be routed to the wrong queue (not the smallest among the relevant queues) with a positive probability. We treat this routing mechanism as a probabilistic routing policy, named a [Formula: see text]-allocation policy, that generalizes the PW([Formula: see text]) policy, and thus also the JSQ and uniform routing, where [Formula: see text] is an [Formula: see text]-dimensional vector whose components are the routing probabilities. Our goal is to study the (in)stability problem of the system under this routing mechanism, and under its “nonidling” version, which assigns new arrivals to an idle server, if such a server is available, and otherwise routes according to the [Formula: see text]-allocation rule. We characterize a sufficient condition for stability, and prove that the stability region, as a function of the system’s primitives and [Formula: see text], is in general smaller than the set [Formula: see text]. Our analyses build on representing the queue process as a continuous-time Markov chain in an ordered space of [Formula: see text]-dimensional real-valued vectors, and using a generalized form of the Schur-convex order. 
    more » « less
  4. Consider a system of homogeneous interacting diffusive particles labeled by the nodes of a unimodular Galton–Watson tree, where the state of each node evolves infinitesi- mally like a d-dimensional diffusion whose drift coefficient depends on (the histories of) its own state and the states of neighboring nodes, and whose diffusion coefficient depends only on (the history of) its own state. Under suitable regularity assumptions on the coefficients, an autonomous characterization is obtained for the marginal dis- tribution of the dynamics of the neighborhood of a typical node in terms of a certain local equation, which is a new kind of stochastic differential equation that is nonlinear in the sense of McKean. This equation describes a finite-dimensional non-Markovian stochastic process whose infinitesimal evolution at any time depends not only on the structure and current state of the neighborhood, but also on the conditional law of the current state given the past of the states of neighborhing nodes until that time. Such marginal distributions are of interest because they arise as weak limits of both marginal distributions and empirical measures of interacting diffusions on many sequences of sparse random graphs, including the configuration model and Erdös–Rényi graphs whose average degrees converge to a finite non-zero limit. The results obtained complement classical results in the mean-field regime, which characterize the limiting dynamics of homogeneous interacting diffusions on complete graphs, as the num- ber of nodes goes to infinity, in terms of a corresponding nonlinear Markov process. However, in the sparse graph setting, the topology of the graph strongly influences the dynamics, and the analysis requires a completely different approach. The proofs of existence and uniqueness of the local equation rely on delicate new conditional independence and symmetry properties of particle trajectories on unimodular Galton– Watson trees, as well as judicious use of changes of measure. 
    more » « less
  5. Ride-pooling, which accommodates multiple passenger requests in a single trip, has the potential to substantially enhance the throughput of mobility-on-demand (MoD) systems. This paper investigates MoD systems that operate mixed fleets composed of “basic supply” and “augmented supply” vehicles. When the basic supply is insufficient to satisfy demand, augmented supply vehicles can be repositioned to serve rides at a higher operational cost. We formulate the joint vehicle repositioning and ride-pooling assignment problem as a two-stage stochastic integer program, where repositioning augmented supply vehicles precedes the realization of ride requests. Sequential ride-pooling assignments aim to maximize total utility or profit on a shareability graph: a hypergraph representing the matching compatibility between available vehicles and pending requests. Two approximation algorithms for midcapacity and high-capacity vehicles are proposed in this paper; the respective approximation ratios are [Formula: see text] and [Formula: see text], where p is the maximum vehicle capacity plus one. Our study evaluates the performance of these approximation algorithms using an MoD simulator, demonstrating that these algorithms can parallelize computations and achieve solutions with small optimality gaps (typically within 1%). These efficient algorithms pave the way for various multimodal and multiclass MoD applications.

    History: This paper has been accepted for the Transportation Science Special Issue on Emerging Topics in Transportation Science and Logistics.

    Funding: This work was supported by the National Science Foundation [Grants CCF-2006778 and FW-HTF-P 2222806], the Ford Motor Company, and the Division of Civil, Mechanical, and Manufacturing Innovation [Grants CMMI-1854684, CMMI-1904575, and CMMI-1940766].

    Supplemental Material: The online appendix is available at https://doi.org/10.1287/trsc.2021.0349 .

     
    more » « less