skip to main content


Search for: All records

Award ID contains: 1717589

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. null (Ed.)
    Different global and semi-partitioned schedulers have been proposed that are soft-real-time (SRT) optimal for sporadic task systems, meaning they can guarantee bounded deadline tardiness. However, under known analyses, tardiness bounds increase with respect to the number of processors, which reduces the applicability of these schedulers in systems with a large number of processors. In this paper, a semi-clustered scheduler, SC-EDF, is presented that has a constant tardiness bound. SC-EDF partitions tasks into clusters, each of which may include one fractional processor. Each cluster is scheduled by G-EDF, and the fractional processors are realized using Pfair scheduling techniques. 
    more » « less
  2. Graphics processing units (GPUs) manufactured by NVIDIA continue to dominate many fields of research, including real-time GPU-management. NVIDIA’s status as a key enabling technology for deep learning and image processing makes this unsurprising, especially when combined with the company’s push into embedded, safety-critical domains like autonomous driving. NVIDIA’s primary competitor, AMD, has received comparatively little attention, due in part to few embedded offerings and a lack of support from popular deep-learning toolkits. Recently, however, AMD’s ROCm (Radeon Open Compute) software platform was made available to address at least the second of these two issues, but is ROCm worth the attention of safety-critical software developers? In order to answer this question, this paper explores the features and pitfalls of AMD GPUs, focusing on contrasting details with NVIDIA’s GPU hardware and software. We argue that an open software stack such as ROCm may be able to provide much-needed flexibility and reproducibility in the context of real-time GPU research, where new algorithmic or analysis techniques should typically remain agnostic to the underlying GPU architecture. In support of this claim, we summarize how closed-source platforms have obstructed prior research using NVIDIA GPUs, and then demonstrate that AMD may be a viable alternative by modifying components of the ROCm software stack to implement spatial partitioning. Finally, we present a case study using the PyTorch deep-learning framework that demonstrates the impact such modifications can have on complex real-world software. 
    more » « less
  3. The applicability of Simultaneous Multithreading (SMT) to real-time systems has been hampered by the difficulty of obtaining reliable execution costs in an SMT-enabled system. This problem is addressed by introducing a scheduling framework, called CERT-MT, that combines scheduling-aware timing analysis with a cyclic-executive scheduler in a way that minimizes SMT-related timing variations. The proposed scheduling-aware timing analysis is based on maximum observed execution times and accounts for the uncertainty inherent in measurement-based timing analysis. The timing analysis is found to work for tasks with and without SMT, though some adjustments are required in the former case. A large-scale schedulability study is presented that shows CERT-MT can schedule systems with total utilizations approaching 1.4 times the core count, without sacrificing safety. 
    more » « less
  4. Autonomous vehicles often employ computer-vision (CV) algorithms that track the movements of pedestrians and other vehicles to maintain safe distances from them. These algorithms are usually expressed as real-time processing graphs that have cycles due to back edges that provide history information. If immediate back history is required, then such a cycle must execute sequentially. Due to this requirement, any graph that contains a cycle with utilization exceeding 1.0 is categorically unschedulable, i.e., bounded graph response times cannot be guaranteed. Unfortunately, such cycles can occur in practice, particularly if conservative execution-time assumptions are made, as befits a safety-critical system. This dilemma can be obviated by allowing older back history, which enables parallelism in cycle execution at the expense of possibly affecting the accuracy of tracking. However, the efficacy of this solution hinges on the resulting history-vs.-accuracy trade-off that it exposes. In this paper, this trade-off is explored in depth through an experimental study conducted using the open-source CARLA autonomous driving simulator. Somewhat surprisingly, easing away from always requiring immediate back history proved to have only a marginal impact on accuracy in this study. 
    more » « less
  5. Many computer-vision (CV) applications used in autonomous vehicles rely on historical results, which introduce cycles in processing graphs. However, existing response-time analysis breaks down in the presence of cycles, either by failing completely or by drastically sacrificing parallelism or CV accuracy. To address this situation, this paper presents a new graph-based task model, based on the recently ratified OpenVX standard, that includes historical requirements and their induced cycles as first-class concepts. Using this model, response-time bounds for graphs that may contain cycles are derived. These bounds expose a tradeoff between responsiveness and CV accuracy that hinges on the extent of allowed parallelism. This tradeoff is illustrated via a CV case study involving pedestrian tracking. In this case study, the methods proposed in this paper enabled significant improvements in both analytical and observed response times, with acceptable CV accuracy, compared to prior methods. 
    more » « less
  6. When designing a real-time multiprocessor locking protocol, the allowance of lock nesting creates complications that can kill parallelism. Such protocols are typically designed by focusing on the arbitration of resource requests that should be prohibited from executing concurrently. This paper proposes “concurrency groups,” a new concept that reflects an alternative point of view that focuses instead on requests that can be allowed to execute concurrently. A concurrency group is simply a group of lock requests, determined offline, that can safely execute together. This paper’s main contribution is the CGLP, a new real-time multiprocessor locking protocol that supports lock nesting through the use of concurrency groups. The CGLP is able to reap runtime parallelism benefits that have eluded prior protocols by investing effort offline in the construction of concurrency groups. A schedulability study is presented to quantify these benefits, as well as an efficient approach to determining such groups using an Integer Linear Program (ILP) solver. 
    more » « less
  7. Prior work has shown that the global earliest-deadline-first (GEDF) scheduler is soft real-time (SRT)-optimal for sporadic task systems in a variety of contexts, meaning that bounded deadline tardiness can be guaranteed under it for any task system that does not cause platform overutilization. However, one particularly compelling context has remained elusive: multiprocessor platforms in which tasks have affinity masks that determine the processors where they may execute. Actual GEDF implementations, such as the SCHED_DEADLINE class in Linux, have dealt with this unresolved question by foregoing SRT guarantees once affinity masks are set. This unresolved question, as it pertains to SCHED_DEADLINE, was included by Peter Zijlstra in a list of important open problems affecting Linux in his keynote talk at ECRTS 2017. In this paper, this question is resolved along with another open problem that at first blush seems unrelated but actually is. Specifically, both problems are closed by establishing two results. First, a proof strategy used previously to establish GEDF tardiness bounds that are exponential in size on heterogeneous uniform multiprocessors is generalized to show that polynomial bounds exist on a wider class of platforms. Second, both uniform multiprocessors and identical multiprocessors with affinities are shown to be within this class. These results yield the first polynomial GEDF tardiness bounds for the uniform case and the first such bounds of any kind for the identical-with-affinities case. 
    more » « less
  8. Existing models used in real-time scheduling are inadequate to take advantage of simultaneous multithreading (SMT), which has been shown to improve performance in many areas of computing, but has seen little application to real-time systems. The SMART task model, which allows for combining SMT and real time by accounting for the variable task execution costs caused by SMT, is introduced, along with methods and conditions for scheduling SMT tasks under global earliest-deadline-first scheduling. The benefits of using SMT are demonstrated through a large-scale schedulability study in which we show that task systems with utilizations 30% larger than what would be schedulable without SMT can be correctly scheduled. This artifact includes benchmark experiments used to compare execution times with and without SMT and code to duplicate the reported schedulability experiments. 
    more » « less