Multiprocessor scheduling of hard real-time tasks modeled by directed acyclic graphs (DAGs) exploits the inherent parallelism presented by the model. For DAG tasks, a node represents a request to execute an object on one of the available processors. In one DAG task, there may be multiple execution requests for one object, each represented by a distinct node. These distinct execution requests offer an opportunity to reduce their combined cache overhead through coordinated scheduling of objects as threads within a parallel task. The goal of this work is to realize this opportunity by incorporating the cache-aware BUNDLE-scheduling algorithm into federated scheduling of sporadic DAG task sets.This is the first work to incorporate instruction cache sharing into federated scheduling. The result is a modification of the DAG model named the DAG with objects and threads (DAG-OT). Under the DAG-OT model, descriptions of nodes explicitly include their underlying executable object and number of threads. When possible, nodes assigned the same executable object are collapsed into a single node; joining their threads when BUNDLE-scheduled. Compared to the DAG model, the DAG-OT model with cache-aware scheduling reduces the number of cores allocated to individual tasks by approximately 20 percent in the synthetic evaluation and up to 50 percent on a novel parallel computing platform implementation. By reducing the number of allocated cores, the DAG-OT model is able to schedule a subset of previously infeasible task sets.
more »
« less
CPU Energy-Aware Parallel Real-Time Scheduling
Both energy-efficiency and real-time performance are critical requirements in many embedded systems applications such as self-driving car, robotic system, disaster response, and security/safety control. These systems entail a myriad of real-time tasks, where each task itself is a parallel task that can utilize multiple computing units at the same time. Driven by the increasing demand for parallel tasks, multi-core embedded processors are inevitably evolving to many-core. Existing work on real-time parallel tasks mostly focused on real-time scheduling without addressing energy consumption. In this paper, we address hard real-time scheduling of parallel tasks while minimizing their CPU energy consumption on multicore embedded systems. Each task is represented as a directed acyclic graph (DAG) with nodes indicating different threads of execution and edges indicating their dependencies. Our technique is to determine the execution speeds of the nodes of the DAGs to minimize the overall energy consumption while meeting all task deadlines. It incorporates a frequency optimization engine and the dynamic voltage and frequency scaling (DVFS) scheme into the classical real-time scheduling policies (both federated and global) and makes them energy-aware. The contributions of this paper thus include the first energy-aware online federated scheduling and also the first energy-aware global scheduling of DAGs. Evaluation using synthetic workload through simulation shows that our energy-aware real-time scheduling policies can achieve up to 68% energy-saving compared to classical (energy-unaware) policies. We have also performed a proof of concept system evaluation using physical hardware demonstrating the energy efficiency through our proposed approach.
more »
« less
- Award ID(s):
- 1724227
- PAR ID:
- 10183034
- Date Published:
- Journal Name:
- Leibniz international proceedings in informatics
- Volume:
- 165
- ISSN:
- 1868-8969
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
With the technology trend of hardware and workload consolidation for embedded systems and the rapid development of edge computing, there has been increasing interest in supporting parallel real-time tasks to better utilize the multi-core platforms while meeting the stringent real-time constraints. For parallel real-time tasks, the federated scheduling paradigm, which assigns each parallel task a set of dedicated cores, achieves good theoretical bounds by ensuring exclusive use of processing resources to reduce interferences. However, because cores share the last-level cache and memory bandwidth resources, in practice tasks may still interfere with each other despite executing on dedicated cores. Such resource interferences due to concurrent accesses can be even more severe for embedded platforms or edge servers, where the computing power and cache/memory space are limited. To tackle this issue, in this work, we present a holistic resource allocation framework for parallel real-time tasks under federated scheduling. Under our proposed framework, in addition to dedicated cores, each parallel task is also assigned with dedicated cache and memory bandwidth resources. Further, we propose a holistic resource allocation algorithm that well balances the allocation between different resources to achieve good schedulability. Additionally, we provide a full implementation of our framework by extending the federated scheduling system with Intel’s Cache Allocation Technology and MemGuard. Finally, we demonstrate the practicality of our proposed framework via extensive numerical evaluations and empirical experiments using real benchmark programs.more » « less
-
While energy consumption is the primary concern for the design of real-time embedded systems, fault-tolerance and quality of service (QoS) are becoming increasingly important in the development of today’s pervasive computing systems. In this work, we study the problem of energy-aware standby-sparing for weakly hard real-time embedded systems. The standby-sparing systems adopt a primary processor and a spare processor to provide fault tolerance for both permanent and transient faults. In order to reduce energy consumption for such kind of systems, we proposed two novel scheduling schemes: one for (1,1)-hard tasks and one for general (m,k)-hard tasks which require that at least m out of any k consecutive jobs of a task meet their deadlines. Through extensive evaluations, our results demonstrate that the proposed techniques significantly outperform the previous research in reducing energy consumption for both (1,1)-hard task sets and general (m,k)-hard task sets while assuring fault tolerance through standby-sparing.more » « less
-
Real-time data stream processing at the edge is crucial for time-sensitive tasks within large-scale IoT systems. Task scheduling plays a key role in managing the Quality of Service (QoS), necessitating a prioritization system to distinguish between high and low-priority tasks, thus ensuring efficient data processing on edge nodes. Existing scheduling algorithms rigidly prioritize tasks deemed as high-priority, often at the expense of fairness and overall system efficiency. In this paper, we propose a Priority-aware Fair Task Scheduling (FTS-Hybrid) algorithm that addresses these challenges by managing priority based task execution in a controlled manner. Our task scheduling algorithm streamlines resource utilization and enhances system responsiveness, contributing to low latency and high throughput, outperforming competing techniques including First-Come-FirstServe (FCFS), Round Robin (RR), and Priority Scheduling (PS). We implemented FTS-Hybrid on Apache Storm and evaluated its performance using an open-source real-time IoT benchmark (RIoTBench). Experimental results show that the FTS-Hybrid algorithm reduces task execution latency by 24%, 31%, and 26% compared with FCFS, RR, and PS, respectively, by strategically mitigating queuing delays under dynamic workload conditions.more » « less
-
Papadopoulos, Alessandro V (Ed.)The rigid timing requirement of real-time applications biases the analysis to focus on the worst-case performances. Such a focus cannot provide enough information to optimize the system’s typical resource and energy consumption. In this work, we study the real-time scheduling of parallel tasks on a multi-speed heterogeneous platform while minimizing their typical-case CPU energy consumption. Dynamic power management (DPM) policy is integrated to determine the minimum number of cores required for each task while guaranteeing worst-case execution requirements (under all circumstances). A Hungarian Algorithm-based task partitioning technique is proposed for clustered multi-core platforms, where all cores within the same cluster run at the same speed at any time, while different clusters may run at different speeds. To our knowledge, this is the first work aiming to minimize typical-case CPU energy consumption (while ensuring the worst-case timing correctness for all tasks under any execution condition) through DPM for parallel tasks in a clustered platform. We demonstrate the effectiveness of the proposed approach with existing power management techniques using experimental results and simulations. The experimental results conducted on the Intel Xeon 2680 v3 12-core platform show around 7%-30% additional energy savings.more » « less