We study the max-min fairness of multi-task jobs in distributed computing platforms. We consider a setting where each job consists of a set of parallel tasks that need to be processed on different servers, and the job is completed once all its tasks finish processing. Each job is associated with a utility which is a decreasing function of its completion time, and captures how sensitive it is to latency. The objective is to schedule tasks in a way that achieves max-min fairness for jobs’ utilities, i.e., an optimal schedule in which any attempt to improve the utility of a job necessarily results in hurting the utility of some other job with smaller or equal utility.We first show a strong result regarding NP-hardness of finding the max-min fair vector of job utilities. The implication of this result is that achieving max-min fairness in many other distributed scheduling problems (e.g., coflow scheduling) is NP-hard. We then proceed to define two notions of approximation solutions: one based on finding a certain number of elements of the max-min fair vector, and the other based on a single-objective optimization whose solution gives the max-min fair vector. We develop scheduling algorithms that provide guarantees under these approximation notions, using dynamic programming and random perturbation of tasks’ processing times. We verify the performance of our algorithms through extensive simulations, using a real traffic trace from a large Google cluster.
more »
« less
On Max-Min Fairness of Completion Times for Multi-Task Job Scheduling
We study the max-min fairness of multi-task jobs in distributed computing platforms. We consider a setting where each job consists of a set of parallel tasks that need to be processed on different servers, and the job is completed once all its tasks finish processing. Each job is associated with a utility which is a decreasing function of its completion time, and captures how sensitive it is to latency. The objective is to schedule tasks in a way that achieves max-min fairness for jobs' utilities, i.e., an optimal schedule in which any attempt to improve the utility of a job necessarily results in hurting the utility of some other job with smaller or equal utility. We first show a strong result regarding NP-hardness of finding the max-min fair vector of job utilities. The implication of this result is that achieving max-min fairness in many other distributed scheduling problems (e.g., coflow scheduling) is NP-hard. We then proceed to define two notions of approximation solutions: one based on finding a certain number of elements of the max-min fair vector, and the other based on a single-objective optimization whose solution gives the max-min fair vector. We develop scheduling algorithms that provide guarantees under these approximation notions, using dynamic programming and random perturbation of tasks' processing times. We verify the performance of our algorithms through extensive simulations, using a real traffic trace from a large Google cluster.
more »
« less
- PAR ID:
- 10249199
- Date Published:
- Journal Name:
- 2020 IFIP Networking Conference (Networking)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We first consider the static problem of allocating resources to (i.e., scheduling) multiple distributed application frameworks, possibly with different priorities and server preferences, in a private cloud with heterogeneous servers. Several fair scheduling mechanisms have been proposed for this purpose. We extend prior results on max-min fair (MMF) and proportional fair (PF) scheduling to this constrained multiresource and multiserver case for generic fair scheduling criteria. The task efficiencies (a metric related to proportional fairness) of max- min fair allocations found by progressive filling are compared by illustrative examples. In the second part of this paper, we consider the online problem (with framework churn) by implementing variants of these schedulers in Apache Mesos using progressive filling to dynamically approximate max-min fair allocations. We evaluate the implemented schedulers in terms of overall execution time of realistic distributed Spark workloads. Our experiments show that resource efficiency is improved and execution times are reduced when the scheduler is “server specific” or when it leverages characterized required resources of the workloads (when known).more » « less
-
We study fair mechanisms for the classic job scheduling problem on unrelated machines with the objective of minimizing the makespan. This problem is equivalent to minimizing the egalitarian social cost in the fair division of chores. The two prevalent fairness notions in the fair division literature are envy-freeness and proportionality. Prior work has established that no envy-free mechanism can provide better than an Ω(log m / log log m)-approximation to the optimal makespan, where m is the number of machines, even when payments to the machines are allowed. In strong contrast to this impossibility, our main result demonstrates that there exists a proportional mechanism (with payments) that achieves a 3/2-approximation to the optimal makespan, and this ratio is tight. To prove this result, we provide a full characterization of allocation functions that can be made proportional with payments. Furthermore, we show that for instances with normalized costs, there exists a proportional mechanism that achieves the optimal makespan. We conclude with important directions for future research concerning other fairness notions, including relaxations of envy-freeness. Notably, we show that the technique leading to the impossibility result for envy-freeness does not extend to its relaxations.more » « less
-
Motivated by cloud computing, we study a market-based approach for job scheduling on multiple machines where users have hard deadlines and prefer earlier completion times. In our model, completing a job provides a benefit equal to its present value, i.e., the value discounted to the time when the job finishes. Users submit job requirements to the cloud provider who non-preemptively schedules jobs to maximize the social welfare, i.e., the sum of present values of completed jobs. Using a simple and fast greedy algorithm, we obtain a 1+s/(s−1) approximation to the optimal schedule, where s>1 is the minimum ratio of a job’s deadline to processing time. Building on our approximation algorithm, we construct a pricing rule to incentivize users to truthfully report all job requirements.more » « less
-
Motivated by modern parallel computing applications, we consider the problem of scheduling parallel-task jobs with heterogeneous resource requirements in a cluster of machines. Each job consists of a set of tasks that can be processed in parallel; however, the job is considered completed only when all its tasks finish their processing, which we refer to as the synchronization constraint. Furthermore, assignment of tasks to machines is subject to placement constraints, that is, each task can be processed only on a subset of machines, and processing times can also be machine dependent. Once a task is scheduled on a machine, it requires a certain amount of resource from that machine for the duration of its processing. A machine can process (pack) multiple tasks at the same time; however, the cumulative resource requirement of the tasks should not exceed the machine’s capacity. Our objective is to minimize the weighted average of the jobs’ completion times. The problem, subject to synchronization, packing, and placement constraints, is NP-hard, and prior theoretical results only concern much simpler models. For the case that migration of tasks among the placement-feasible machines is allowed, we propose a preemptive algorithm with an approximation ratio of [Formula: see text]. In the special case that only one machine can process each task, we design an algorithm with an improved approximation ratio of four. Finally, in the case that migrations (and preemptions) are not allowed, we design an algorithm with an approximation ratio of 24. Our algorithms use a combination of linear program relaxation and greedy packing techniques. We present extensive simulation results, using a real traffic trace, that demonstrate that our algorithms yield significant gains over the prior approaches.more » « less
An official website of the United States government

