skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Optimal Dynamic Task Scheduling in Heterogeneous Cloud Computing Environment
Cloud computing (CC), often necessitates dynamic adjustments due to its inherently fluid nature. In this paper, we introduce a novel dynamic task scheduling model that incorporates reward and holding cost considerations, leveraging the Continuous-Time Markov Decision Process (CTMDP) framework in heterogeneous CC systems. The primary goal of this model is to maximize the overall system reward for the Cloud Service Provider. By solving the Bellman Optimality Equation using the value-iteration method, we can derive an optimal scheduling policy for the dynamic task scheduling model. Additionally, to enhance its practicality in real-world scenarios, we incorporate a model-free reinforcement learning algorithm to obtain the optimal policy for our proposed model without requiring explicit knowledge of the system environment. Simulation results demonstrate that our proposed model outperforms two common static scheduling methods.  more » « less
Award ID(s):
2318662
PAR ID:
10544821
Author(s) / Creator(s):
; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-5346-4
Page Range / eLocation ID:
40 to 46
Subject(s) / Keyword(s):
Cloud Data Center Cognitive Network Optimal strategy Cost Efficiency Bandwidth Allocation.
Format(s):
Medium: X
Location:
BALI, Indonesia
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We study the deadline scheduling problem for multiple deferrable jobs that arrive in a random manner and are to be processed before individual deadlines. The processing of the jobs is subject to a time-varying limit on the total processing rate at each stage. We formulate the scheduling problem as a restless multi-armed bandit (RMAB) problem. Relaxing the scheduling problem into multiple independent single-arm scheduling problems, we define the Lagrangian priority value as the greatest tax under which it is optimal to activate the arm, and establish the asymptotic optimality of the proposed Lagrangian priority policy for large systems. Numerical results show that the proposed Lagrangian priority policy achieves 22%-49% higher average reward than the classical Whittle index policy (that does not take into account the processing rate limits). 
    more » « less
  2. The emerging connected-vehicle technology provides a new dimension for developing more intelligent traffic control algorithms for signalized intersections. An important challenge for scheduling in networked transportation systems is the switchover delay caused by the guard time before any traffic signal change. The switch-over delay can result in significant loss of system capacity and hence needs to be accommodated in the scheduling design. To tackle this challenge, we propose a distributed online scheduling policy that extends the wellknown Max-Pressure policy to address switch-over delay by introducing a bias factor favoring the current schedule. We prove that the proposed policy is throughput-optimal with switch-over delay. Furthermore, the proposed policy remains optimal when there are both connected signalized intersections and conventional fixed-time ones in the system. With connected-vehicle technology, the proposed policy can be easily incorporated into the current transportation systems without additional infrastructure. Through extensive simulation in VISSIM, we show that our policy indeed outperforms the existing popular policies. 
    more » « less
  3. In robust Markov decision processes (MDPs), the uncertainty in the transition kernel is addressed by finding a policy that optimizes the worst-case performance over an uncertainty set of MDPs. While much of the literature has focused on discounted MDPs, robust average-reward MDPs remain largely unexplored. In this paper, we focus on robust average-reward MDPs, where the goal is to find a policy that optimizes the worst-case average reward over an uncertainty set. We first take an approach that approximates average-reward MDPs using discounted MDPs. We prove that the robust discounted value function converges to the robust average-reward as the discount factor goes to 1, and moreover when it is large, any optimal policy of the robust discounted MDP is also an optimal policy of the robust average-reward. We further design a robust dynamic programming approach, and theoretically characterize its convergence to the optimum. Then, we investigate robust average-reward MDPs directly without using discounted MDPs as an intermediate step. We derive the robust Bellman equation for robust average-reward MDPs, prove that the optimal policy can be derived from its solution, and further design a robust relative value iteration algorithm that provably finds its solution, or equivalently, the optimal robust policy. 
    more » « less
  4. Automating robotic surgery via learning from demonstration (LfD) techniques is extremely challenging. This is because surgical tasks often involve sequential decisionmaking processes with complex interactions of physical objects and have low tolerance for mistakes. Prior works assume that all demonstrations are fully observable and optimal, which might not be practical in the real world. This paper introduces a sample-efficient method that learns a robust reward function from a limited amount of ranked suboptimal demonstrations consisting of partial-view point cloud observations. The method then learns a policy by optimizing the learned reward function using reinforcement learning (RL). We show that using a learned reward function to obtain a policy is more robust than pure imitation learning. We apply our approach on a physical surgical electrocautery task and demonstrate that our method can perform well even when the provided demonstrations are suboptimal and the observations are highdimensional point clouds. 
    more » « less
  5. Semi-partitioned scheduling is an approach to multiprocessor real-time scheduling where most tasks are fixed to processors, while a small subset of tasks is allowed to migrate. This approach offers reduced overhead compared to global scheduling, and can reduce processor capacity loss compared to partitioned scheduling. Prior work has resulted in a number of semi-partitioned scheduling algorithms, but their correctness typically hinges on a complex intertwining of offline task assignment and online execution. This brittleness has resulted in few proposed semi-partitioned scheduling algorithms that support dynamic task systems, where tasks may join or leave the system at runtime, and few that are optimal in any sense. This paper introduces EDF-sc, the first semi-partitioned scheduling algorithm that is optimal for scheduling (static) soft real-time (SRT) sporadic task systems and allows tasks to dynamically join and leave. The SRT notion of optimality provided by EDF-sc requires deadline tardiness to be bounded for any task system that does not cause over-utilization. In the event that all tasks can be assigned as fixed, EDF-sc behaves exactly as partitioned EDF. Heuristics are provided that give EDF-sc the novel ability to stabilize the workload to approach the partitioned case as tasks join and leave the system. 
    more » « less