Apache Mesos, a cluster-wide resource manager, is widely deployed in massive scale at several Clouds and Data Centers. Mesos aims to provide high cluster utilization via fine grained resource co-scheduling and resource fairness among multiple users through Dominant Resource Fairness (DRF) based allocation. DRF takes into account different resource types (CPU, Memory, Disk I/O) requested by each application and determines the share of each cluster resource that could be allocated to the applications. Mesos has adopted a two-level scheduling policy: (1) DRF to allocate resources to competing frameworks and (2) task level scheduling by each framework for the resources allocated during the previous step. We have conducted experiments in a local Mesos cluster when used with frameworks such as Apache Aurora, Marathon, and our own framework Scylla, to study resource fairness and cluster utilization. Experimental results show how informed decision regarding second level scheduling policy of frameworks and attributes like offer holding period, offer refusal cycle and task arrival rate can reduce unfair resource distribution. Bin-Packing scheduling policy on Scylla with Marathon can reduce unfair allocation from 38% to 3%. By reducing unused free resources in offers we bring down the unfairness from to 90% to 28%. We also showmore »
Tromino: Demand and DRF Aware Multi-Tenant Queue Manager for Apache Mesos Cluster
Apache Mesos, a two-level resource scheduler, provides resource sharing across multiple users in a multi-tenant clustered environment. Computational resources (i.e., CPU, memory, disk, etc.) are distributed according to the Dominant Resource Fairness (DRF) policy. Mesos frameworks (users) receive resources based on their current usage and are responsible for scheduling their tasks within the allocation. We have observed that multiple frameworks can cause fairness imbalance in a multi-user environment. For example, a greedy framework consuming more than its fair share of resources can deny resource fairness to others. The user with the least Dominant Share is considered first by the DRF module to get its resource allocation. However, the default DRF implementation, in Apache Mesos' Master allocation module, does not consider the overall resource demands of the tasks in the queue for each user/framework. This lack of awareness can lead to poor performance as users without any pending task may receive more resource offers, and users with a queue of pending tasks can starve due to their high dominant shares. In a multi-tenant environment, the characteristics of frameworks and workloads must be understood by cluster managers to be able to define fairness based on not only resource share but also resource more »
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- 2018 IEEE/ACM 11th International Conference on Utility and Cloud Computing (UCC)
- Sponsoring Org:
- National Science Foundation
More Like this
We first consider the static problem of allocating resources to (i.e., scheduling) multiple distributed application frameworks, possibly with different priorities and server preferences, in a private cloud with heterogeneous servers. Several fair scheduling mechanisms have been proposed for this purpose. We extend prior results on max-min fair (MMF) and proportional fair (PF) scheduling to this constrained multiresource and multiserver case for generic fair scheduling criteria. The task efficiencies (a metric related to proportional fairness) of max- min fair allocations found by progressive filling are compared by illustrative examples. In the second part of this paper, we consider the online problem (with framework churn) by implementing variants of these schedulers in Apache Mesos using progressive filling to dynamically approximate max-min fair allocations. We evaluate the implemented schedulers in terms of overall execution time of realistic distributed Spark workloads. Our experiments show that resource efficiency is improved and execution times are reduced when the scheduler is “server specific” or when it leverages characterized required resources of the workloads (when known).
With the technology trend of hardware and workload consolidation for embedded systems and the rapid development of edge computing, there has been increasing interest in supporting parallel real-time tasks to better utilize the multi-core platforms while meeting the stringent real-time constraints. For parallel real-time tasks, the federated scheduling paradigm, which assigns each parallel task a set of dedicated cores, achieves good theoretical bounds by ensuring exclusive use of processing resources to reduce interferences. However, because cores share the last-level cache and memory bandwidth resources, in practice tasks may still interfere with each other despite executing on dedicated cores. Such resource interferences due to concurrent accesses can be even more severe for embedded platforms or edge servers, where the computing power and cache/memory space are limited. To tackle this issue, in this work, we present a holistic resource allocation framework for parallel real-time tasks under federated scheduling. Under our proposed framework, in addition to dedicated cores, each parallel task is also assigned with dedicated cache and memory bandwidth resources. Further, we propose a holistic resource allocation algorithm that well balances the allocation between different resources to achieve good schedulability. Additionally, we provide a full implementation of our framework by extending themore »
We consider an LTE downlink scheduling system where a base station allocates resource blocks (RBs) to users running delay-sensitive applications. We aim to find a scheduling policy that minimizes the queuing delay experienced by the users. We formulate this problem as a Markov Decision Process (MDP) that integrates the channel quality indicator (CQI) of each user in each RB, and queue status of each user. To solve this complex problem involving high dimensional state and action spaces, we propose a Deep Reinforcement Learning based scheduling framework that utilizes the Deep Deterministic Policy Gradient (DDPG) algorithm to minimize the queuing delay experienced by the users. Our extensive experiments demonstrate that our approach outperforms state-of-the-art benchmarks in terms of average throughput, queuing delay, and fairness, achieving up to 55% lower queuing delay than the best benchmark.
Towards Enabling Dynamic Resource Estimation and Correction for Improving Utilization in an Apache Mesos Cloud EnvironmentAcademic cloud infrastructures require users to specify an estimate of their resource requirements. The resource usage for applications often depends on the input file sizes, parameters, optimization flags, and attributes, specified for each run. Incorrect estimation can result in low resource utilization of the entire infrastructure and long wait times for jobs in the queue. We have designed a Resource Utilization based Migration (RUMIG) system to address the resource estimation problem. We present the overall architecture of the two-stage elastic cluster design, the Apache Mesos-specific container migration system, and analyze the performance for several scientific workloads on three different cloud/cluster environments. In this paper we (b) present a design and implementation for container migration in a Mesos environment, (c) evaluate the effect of right-sizing and cluster elasticity on overall performance, (d) analyze different profiling intervals to determine the best fit, (e) determine the overhead of our profiling mechanism. Compared to the default use of Apache Mesos, in the best cases, RUMIG provides a gain of 65% in runtime (local cluster), 51% in CPU utilization in the Chameleon cloud, and 27% in memory utilization in the Jetstream cloud.