skip to main content


Title: Exploration on Task Scheduling Strategy of CPU-GPU Heterogeneous Computing System
For a CPU-GPU heterogeneous computing system, different types of processors have load balancing problems in the calculation process. What’s more, multitasking cannot be matched to the appropriate processor core is also an urgent problem to be solved. In this paper, we propose a task scheduling strategy for high-performance CPU-GPU heterogeneous computing platform to solve these problems. For the single task model, a task scheduling strategy based on loadaware for CPU-GPU heterogeneous computing platform is proposed. This strategy detects the computing power of the CPU and GPU to process specified tasks, and allocates computing tasks to the CPU and GPU according to the perception ratio. The tasks are stored in a bidirectional queue to reduce the additional overhead brought by scheduling. For the multi-task model, a task scheduling strategy based on the genetic algorithm for CPU-GPU heterogeneous computing platform is proposed. The strategy aims at improving the overall operating efficiency of the system, and accurately binds the execution relationship between different types of tasks and heterogeneous processing cores. Our experimental results show that the scheduling strategy can improve the efficiency of parallel computing as well as system performance.  more » « less
Award ID(s):
1828105
NSF-PAR ID:
10347479
Author(s) / Creator(s):
Date Published:
Journal Name:
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Graphics Processing Units (GPU) are increasingly deployed on Cyber-physical Systems (CPSs), frequently used to perform real-time safety-critical functions, such as object detection on autonomous vehicles. As a result, availability is important for GPU tasks in CPS platforms. However, existing Trusted Execution Environments (TEE) solutions with availability guarantees focus only on CPU computing.To bridge this gap, we propose AvaGPU, a TEE that guarantees real-time availability for CPU tasks involving GPU execution under compromised OS. There are three technical challenges. First, to prevent malicious resource contention due to separate scheduling of CPU and GPU tasks, we proposed a CPU-GPU co-scheduling framework that couples the priority of CPU and GPU tasks. Second, we propose software-based secure preemption on GPU tasks to bound the degree of priority inversion on GPU. Third, we propose a new split design of GPU driver with minimized Trusted Computing Base (TCB) to achieve secure and efficient GPU management for CPS. We implement a prototype of AvaGPU on the Jetson AGX Orin platform. The system is evaluated on benchmark, synthetic tasks, and real-world applications with 15.87% runtime overhead on average. 
    more » « less
  2. While High Performance Computing systems are increas-ingly based on heterogeneous cores, their effectiveness depends on howwell the scheduler can allocate workloads onto appropriate computing de-vices and how communication and computation can be overlapped. Withdifferent types of resources integrated into one system, the complexity ofthe scheduler correspondingly increases. Moreover, for applications withvarying problem sizes on different heterogeneous resources, the optimalscheduling approach may vary accordingly. We thus present PDAWL, anevent-driven profile-based Iterative Dynamic Adaptive Work-Load bal-ance scheduling approach to dynamically and adaptively adjust workloadto efficiently utilize heterogeneous resources. It combines online schedul-ing (DAWL), which can adaptively adjust workload based on availablereal time heterogeneous resources, with an offline machine learning (profile-based estimation model) which can build a device-specific communica-tion computation estimation model. Our scheduling approach is tested oncontrol-regular applications, Stencil kernel (based on a Jacobi Algorithm)and Sparse Matrix-Vector Multiplication (SpMV) in an event-driven run-time system. Experimental results show that PDAWL is either on-par orfar outperforms whichever yields the best results (CPU or GPU). 
    more » « less
  3. While High Performance Computing systems are increasingly based on heterogeneous cores, their e ectiveness depends on how well the scheduler can allocate workloads onto appropriate computing devices and how communication and computation can be overlapped. With di erent types of resources integrated into one system, the complexity of the scheduler correspondingly increases. Moreover, for applications with varying problem sizes on di erent heterogeneous resources, the optimal scheduling approach may vary accordingly. We thus present PDAWL, an event-driven pro le-based Iterative Dynamic Adaptive Work-Load balance scheduling approach to dynamically and adaptively adjust workload to eciently utilize heterogeneous resources. It combines online scheduling (DAWL), which can adaptively adjust workload based on available real time heterogeneous resources, with an oine machine learning (pro lebased estimation model) which can build a device-speci c communication computation estimation model. Our scheduling approach is tested on control-regular applications, Stencil kernel (based on a Jacobi Algorithm) and Sparse Matrix-Vector Multiplication (SpMV) in an event-driven runtime system. Experimental results show that PDAWL is either on-par or far outperforms whichever yields the best results (CPU or GPU). 
    more » « less
  4. Heterogeneous chiplets have been proposed for accelerating high-performance computing tasks. Integrated inside one package, CPU and GPU chiplets can share a common interconnection network that can be implemented through the interposer. However, CPU and GPU applications have very different traffic patterns in general. Without effective management of the network resource, some chiplets can suffer significant performance degradation because the network bandwidth is taken away by communication-intensive applications. Therefore, techniques need to be developed to effectively manage the shared network resources. In a chiplet-based system, resource management needs to not only react in real-time but also be cost-efficient. In this work, we propose a reconfigurable network architecture, leveraging Kalman Filter to make accurate predictions on network resources needed by the applications and then adaptively change the resource allocation. Using our design, the network bandwidth can be fairly allocated to avoid starvation or performance degradation. Our evaluation results show that the proposed reconfigurable interconnection network can dynamically react to the changes in traffic demand of the chiplets and improve the system performance with low cost and design complexity. 
    more » « less
  5. Both energy-efficiency and real-time performance are critical requirements in many embedded systems applications such as self-driving car, robotic system, disaster response, and security/safety control. These systems entail a myriad of real-time tasks, where each task itself is a parallel task that can utilize multiple computing units at the same time. Driven by the increasing demand for parallel tasks, multi-core embedded processors are inevitably evolving to many-core. Existing work on real-time parallel tasks mostly focused on real-time scheduling without addressing energy consumption. In this paper, we address hard real-time scheduling of parallel tasks while minimizing their CPU energy consumption on multicore embedded systems. Each task is represented as a directed acyclic graph (DAG) with nodes indicating different threads of execution and edges indicating their dependencies. Our technique is to determine the execution speeds of the nodes of the DAGs to minimize the overall energy consumption while meeting all task deadlines. It incorporates a frequency optimization engine and the dynamic voltage and frequency scaling (DVFS) scheme into the classical real-time scheduling policies (both federated and global) and makes them energy-aware. The contributions of this paper thus include the first energy-aware online federated scheduling and also the first energy-aware global scheduling of DAGs. Evaluation using synthetic workload through simulation shows that our energy-aware real-time scheduling policies can achieve up to 68% energy-saving compared to classical (energy-unaware) policies. We have also performed a proof of concept system evaluation using physical hardware demonstrating the energy efficiency through our proposed approach. 
    more » « less