skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Bag-Of-Tasks Scheduling on Related Machines
We consider online scheduling to minimize weighted completion time on related machines, where each job consists of several tasks that can be concurrently executed. A job gets completed when all its component tasks finish. We obtain an O(K³ log² K)-competitive algorithm in the non-clairvoyant setting, where K denotes the number of distinct machine speeds. The analysis is based on dual-fitting on a precedence-constrained LP relaxation that may be of independent interest.  more » « less
Award ID(s):
2006953 1955785 1907820
PAR ID:
10327020
Author(s) / Creator(s):
; ;
Editor(s):
Wootters, Mary and
Date Published:
Journal Name:
Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the max-min fairness of multi-task jobs in distributed computing platforms. We consider a setting where each job consists of a set of parallel tasks that need to be processed on different servers, and the job is completed once all its tasks finish processing. Each job is associated with a utility which is a decreasing function of its completion time, and captures how sensitive it is to latency. The objective is to schedule tasks in a way that achieves max-min fairness for jobs' utilities, i.e., an optimal schedule in which any attempt to improve the utility of a job necessarily results in hurting the utility of some other job with smaller or equal utility. We first show a strong result regarding NP-hardness of finding the max-min fair vector of job utilities. The implication of this result is that achieving max-min fairness in many other distributed scheduling problems (e.g., coflow scheduling) is NP-hard. We then proceed to define two notions of approximation solutions: one based on finding a certain number of elements of the max-min fair vector, and the other based on a single-objective optimization whose solution gives the max-min fair vector. We develop scheduling algorithms that provide guarantees under these approximation notions, using dynamic programming and random perturbation of tasks' processing times. We verify the performance of our algorithms through extensive simulations, using a real traffic trace from a large Google cluster. 
    more » « less
  2. We study the max-min fairness of multi-task jobs in distributed computing platforms. We consider a setting where each job consists of a set of parallel tasks that need to be processed on different servers, and the job is completed once all its tasks finish processing. Each job is associated with a utility which is a decreasing function of its completion time, and captures how sensitive it is to latency. The objective is to schedule tasks in a way that achieves max-min fairness for jobs’ utilities, i.e., an optimal schedule in which any attempt to improve the utility of a job necessarily results in hurting the utility of some other job with smaller or equal utility.We first show a strong result regarding NP-hardness of finding the max-min fair vector of job utilities. The implication of this result is that achieving max-min fairness in many other distributed scheduling problems (e.g., coflow scheduling) is NP-hard. We then proceed to define two notions of approximation solutions: one based on finding a certain number of elements of the max-min fair vector, and the other based on a single-objective optimization whose solution gives the max-min fair vector. We develop scheduling algorithms that provide guarantees under these approximation notions, using dynamic programming and random perturbation of tasks’ processing times. We verify the performance of our algorithms through extensive simulations, using a real traffic trace from a large Google cluster. 
    more » « less
  3. High-performance computing (HPC) resources are used for compute-demanding calculations in various fields of science and engineering. They are large computational facilities utilized by many users simultaneously. High utilization often leads to high waiting times. Simulating users' behavior on such a system can help with future system design, develop user interventions, and ultimately improve the user’s experience and resource utilization. Here, we present HPCMod, an Agent-Based Modeling Framework for Modeling Users on HPC Resources. The key concept of the framework is the representation of the user's computational needs: the user project is represented as a collection of possibly dependent compute tasks. Each task can be executed as a single compute job or a series of jobs, depending on the task size. Some tasks can be too big to be executed in one chunk; such a situation often occurs during molecular dynamics simulation. There are multiple ways in which tasks can be split into jobs, and users will make their decisions based on previous experience, application parallel scalability, and available resources. For example, a user's compute task requires 32 node hours; it can be executed in multiple ways: a single 32-hour job on one node, two sequential 16-hour jobs on one node, one 16-hour job on two nodes, and so on. In the HPCMod, we implemented three models: 1) historical replay of compute jobs, 2) simulation of reconstituted compute tasks using historical job sizes, and 3) adaptive compute tasks splitting where users can modify jobs parameters given available resources till the execution of the next job in line. The framework was tested on a ten-node test system and a larger 1,736-node system modeled after a portion of TACC Stampede-2. The HPC resource model implements a first in first out (FIFO) scheduler with backfill scheduling. The initial results showed that on a tiny system, adaptive task-splitting is beneficial for the user but leads to a larger number of jobs. On a large system, the adaptive task-splitting was also very beneficial, decreasing waiting times for users using this strategy almost two times; however, other users got a 5% increase in their wait time. Further investigation is needed as the current task reconstitution algorithm is deterministic and does not allow quantification of job recombination uncertainties. The Julia-based implementation is fast: five years of historic workflow consisting of a million jobs and a one-hour stepping took around three minutes. 
    more » « less
  4. We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into $$k$$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $$n$$ distributed nodes. The goal is to reduce the average execution time of the computational job. We provide a connection between the problem of characterizing the average execution time of a coded distributed computing system and the problem of analyzing the error probability of codes of length $$n$$ used over erasure channels. Accordingly, we present closed-form expressions for the execution time using binary random linear codes and the best execution time any linear-coded distributed computing system can achieve. It is also shown that there exist good binary linear codes that attain, asymptotically, the best performance any linear code, not necessarily binary, can achieve. We also investigate the performance of coded distributed computing systems using polar and Reed-Muller (RM) codes that can benefit from low-complexity decoding, and superior performance, respectively, as well as explicit constructions. The proposed framework in this paper can enable efficient designs of distributed computing systems given the rich literature in the channel coding theory. 
    more » « less
  5. The ability to accurately estimate job runtime properties allows a scheduler to effectively schedule jobs. State-of-the-art online cluster job schedulers use history-based learning, which uses past job execution information to estimate the runtime properties of newly arrived jobs. However, with fast-paced development in cluster technology (in both hardware and software) and changing user inputs, job runtime properties can change over time, which lead to inaccurate predictions. In this paper, we explore the potential and limitation of real-time learning of job runtime properties, by proactively sampling and scheduling a small fraction of the tasks of each job. Such a task-sampling-based approach exploits the similarity among runtime properties of the tasks of the same job and is inherently immune to changing job behavior. Our analytical and experimental analysis of 3 production traces with different skew and job distribution shows that learning in space can be substantially more accurate. Our simulation and testbed evaluation on Azure of the two learning approaches anchored in a generic job scheduler using 3 production cluster job traces shows that despite its online overhead, learning in space reduces the average Job Completion Time (JCT) by 1.28x, 1.56x, and 1.32x compared to the prior-art history-based predictor. Finally, we show how sampling-based learning can be extended to schedule DAG jobs and achieve similar speedups over the prior-art history-based predictor. 
    more » « less