Due to a growing interest in deep learning applications [5], compute-intensive and long-running (hours to days) training jobs have become a significant component of datacenter workloads. A large fraction of these jobs is often exploratory, with the goal of determining the best model structure (e.g., the number of layers and channels in a convolutional neural network), hyperparameters (e.g., the learning rate), and data augmentation strategies for the target application. Notably, training jobs are often terminated early if their learning metrics (e.g., training and validation accuracy) are not converging, with only a few completing successfully. For this motivating application, we consider the problem of scheduling a set of jobs that can be terminated at predetermined checkpoints with known probabilities estimated from historical data. We prove that, in order to minimize the time to complete the first K successful jobs on a single server, optimal scheduling does not require preemption (even when preemption overhead is negligible) and provide an optimal policy; advantages of this policy are quantified through simulation. Related Work. While job scheduling has been investigated extensively in many scenarios (see [6] and [2] for a survey of recent result), most policies require that the cost of waiting times of each job be known at scheduling time; in contrast, in our setting the scheduler does not know which job will be the K-th successful job, and sojourn times of subsequent jobs do not contribute to the target metric. For example, [4, 3] minimize makespan (i.e., the time to complete all jobs) for known execution times and waiting time costs; similarly, Gittins index [1] and SR rank [7] minimize expected sojourn time of all jobs, i.e., both successfully completed jobs and jobs terminated early. Unfortunately, scheduling policies not distinguishing between these two types of jobs may favor jobs where the next stage is short and leads to early termination with high probability, which is an undesirable outcome in our applications of interest.
more »
« less
The most common queueing theory questions asked by computer systems practitioners
This document examines five performance questions which are repeatedly asked by practitioners in industry: (i) My system utilization is very low, so why are job delays so high? (ii) What should I do to lower job delays? (iii) How can I favor short jobs if I don't know which jobs are short? (iv) If some jobs are more important than others, how do I negotiate importance versus size? (v) How do answers change when dealing with a closed-loop system, rather than an open system? All these questions have simple answers through queueing theory. This short paper elaborates on the questions and their answers. To keep things readable, our tone is purposely informal throughout. For more formal statements of these questions and answers, please see [14].
more »
« less
- PAR ID:
- 10349145
- Date Published:
- Journal Name:
- ACM SIGMETRICS Performance Evaluation Review
- Volume:
- 49
- Issue:
- 4
- ISSN:
- 0163-5999
- Page Range / eLocation ID:
- 3 to 7
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Data center workloads are composed of multiresource jobs requiring a variety of computational resources including CPU cores, memory, disk space, and hardware accelerators. Mod- ern servers can run multiple jobs in parallel, but a set of jobs can only run in parallel if the server has sufficient resources to satisfy the demands of each job. It is generally hard to find sets of jobs that perfectly utilize all server resources, and choosing the wrong set of jobs can lead to low resource uti- lization. This raises the question of how to allocate resources across a stream of arriving multiresource jobs to minimize the mean response time across jobs — the mean time from when a job arrives to the system until it is complete. Current policies for scheduling multiresource jobs are com- plex to analyze and hard to implement. We propose a class of simple policies, called Markovian Service Rate (MSR) policies. We show that the class of MSR policies is throughput- optimal, in that if a policy exists that can stabilize the sys- tem, then an MSR policy exists that stabilizes the system. We derive bounds on the mean response time under an MSR policy, and show how our bounds can be used to choose an MSR policy that minimizes mean response time.more » « less
-
Abstract Twister2 is an open‐source big data hosting environment designed to process both batch and streaming data at scale. Twister2 runs jobs in both high‐performance computing (HPC) and big data clusters. It provides a cross‐platform resource scheduler to run jobs in diverse environments. Twister2 is designed with a layered architecture to support various clusters and big data problems. In this paper, we present the cross‐platform resource scheduler of Twister2. We identify required services and explain implementation details. We present job startup delays for single jobs and multiple concurrent jobs in Kubernetes and OpenMPI clusters. We compare job startup delays for Twister2 and Spark at a Kubernetes cluster. In addition, we compare the performance of terasort algorithm on Kubernetes and bare metal clusters at AWS cloud.more » « less
-
Multiserver-job systems, where jobs require concurrent service at many servers, occur widely in practice. Essentially all of the theoretical work on multiserver-job systems focuses on maximizing utilization, with almost nothing known about mean response time. In simpler settings, such as various known-size single-server-job settings, minimizing mean response time is merely a matter of prioritizing small jobs. However, for the multiserver-job system, prioritizing small jobs is not enough, because we must also ensure servers are not unnecessarily left idle. Thus, minimizing mean response time requires prioritizing small jobs while simultaneously maximizing throughput. Our question is how to achieve these joint objectives. We devise the ServerFilling-SRPT scheduling policy, which is the first policy to minimize mean response time in the multiserver-job model in the heavy traffic limit. In addition to proving this heavy-traffic result, we present empirical evidence that ServerFilling-SRPT outperforms all existing scheduling policies for all loads, with improvements by orders of magnitude at higher loads. Because ServerFilling-SRPT requires knowing job sizes, we also define the ServerFilling-Gittins policy, which is optimal when sizes are unknown or partially known.more » « less
-
null (Ed.)Cloud computing today is dominated by multi-server jobs. These are jobs that request multiple servers simultaneously and hold onto all of these servers for the duration of the job. Multi-server jobs add a lot of complexity to the traditional one-server-per-job model: an arrival might not "fit" into the available servers and might have to queue, blocking later arrivals and leaving servers idle. From a queueing perspective, almost nothing is understood about multi-server job queueing systems; even understanding the exact stability region is a very hard problem. In this paper, we investigate a multi-server job queueing model under scaling regimes where the number of servers in the system grows. Specifically, we consider a system with multiple classes of jobs, where jobs from different classes can request different numbers of servers and have different service time distributions, and jobs are served in first-come-first-served order. The multi-server job model opens up new scaling regimes where both the number of servers that a job needs and the system load scale with the total number of servers. Within these scaling regimes, we derive the first results on stability, queueing probability, and the transient analysis of the number of jobs in the system for each class. In particular we derive sufficient conditions for zero queueing. Our analysis introduces a novel way of extracting information from the Lyapunov drift, which can be applicable to a broader scope of problems in queueing systems.more » « less
An official website of the United States government

