skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Predictable GPU Sharing in Component-Based Real-Time Systems
This paper presents a real-time locking protocol whose design was motivated by the goal of enabling safe GPU sharing in time-sliced component-based systems. This locking protocol enables a GPU to be shared concurrently across, and utilized within, isolated components with predictable execution times. It relies on a novel resizing technique where GPU work is dimensioned on-the-fly to run on partitions of an NVIDIA GPU. This technique can be applied to any component that internally utilizes global CPU scheduling. The proposed locking protocol enables increased GPU parallelism and reduces GPU capacity loss with analytically provable benefits.  more » « less
Award ID(s):
2038855 2333120 2151829
PAR ID:
10560544
Author(s) / Creator(s):
; ; ;
Editor(s):
Pellizzoni, Rodolfo
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
298
ISSN:
1868-8969
ISBN:
978-3-95977-324-9
Page Range / eLocation ID:
298-298
Subject(s) / Keyword(s):
GPU locking protocols real-time locking protocols priority-inversion blocking component-based systems Computer systems organization → Real-time systems
Format(s):
Medium: X Size: 22 pages; 1649277 bytes Other: application/pdf
Size(s):
22 pages 1649277 bytes
Right(s):
Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
Sponsoring Org:
National Science Foundation
More Like this
  1. Homodyne detection is a common self-referenced technique to extract optical quadratures. Due to ubiquitous fluctuations, experiments measuring optical quadratures require homodyne angle control. Current homodyne angle locking techniques only provide high quality error signals in a span significantly smaller thanπradians, the span required for full state tomography, leading to inevitable discontinuities during full tomography. Here, we present and demonstrate a locking technique using a universally tunable modulator which produces high quality error signals at an arbitrary homodyne angle. Our work enables continuous full-state tomography and paves the way to backaction evasion protocols based on a time-varying homodyne angle. 
    more » « less
  2. When designing a real-time multiprocessor locking protocol, the allowance of lock nesting creates complications that can kill parallelism. Such protocols are typically designed by focusing on the arbitration of resource requests that should be prohibited from executing concurrently. This paper proposes "concurrency groups," a new concept that reflects an alternative point of view that focuses instead on requests that can be allowed to execute concurrently. A concurrency group is simply a group of lock requests, determined offline, that can safely execute together. This paper's main contribution is the CGLP, a new real-time multiprocessor locking protocol that supports lock nesting through the use of concurrency groups. The CGLP is able to reap runtime parallelism benefits that have eluded prior protocols by investing effort offline in the construction of concurrency groups. A schedulability study is presented to quantify these benefits, as well as an efficient approach to determining such groups using an Integer Linear Program (ILP) solver. 
    more » « less
  3. When designing a real-time multiprocessor locking protocol, the allowance of lock nesting creates complications that can kill parallelism. Such protocols are typically designed by focusing on the arbitration of resource requests that should be prohibited from executing concurrently. This paper proposes “concurrency groups,” a new concept that reflects an alternative point of view that focuses instead on requests that can be allowed to execute concurrently. A concurrency group is simply a group of lock requests, determined offline, that can safely execute together. This paper’s main contribution is the CGLP, a new real-time multiprocessor locking protocol that supports lock nesting through the use of concurrency groups. The CGLP is able to reap runtime parallelism benefits that have eluded prior protocols by investing effort offline in the construction of concurrency groups. A schedulability study is presented to quantify these benefits, as well as an efficient approach to determining such groups using an Integer Linear Program (ILP) solver. 
    more » « less
  4. Transaction isolation is conventionally achieved by restricting access to the physical items in a database. To maximize performance, isolation functionality is often packaged with recovery, I/O, and data access methods in a monolithic transactional storage manager. While this design has historically afforded high performance in online transaction processing systems, industry trends indicate a growing need for a new approach in which intertwined components of the transactional storage manager are disaggregated into modular services. This paper presents a new method to modularize the isolation component. Our work builds on predicate locking, an isolation mechanism that enables this modularization by locking logical rather than physical items in a database. Predicate locking is rarely used as the core isolation mechanism because of its high theoretical complexity and perceived overhead. However, we show that this overhead can be substantially reduced in practice by optimizing for common predicate structures. We present DIBS, a transaction scheduler that employs our predicate locking optimizations to guarantee isolation as a modular service. We evaluate the performance of DIBS as the sole isolation mechanism in a data processing system. In this setting, DIBS scales up to 10.5 million transactions per second on a TATP workload. We also explore how DIBS can be applied to existing database systems to increase transaction throughput. DIBS reduces per-transaction file system writes by 90% on TATP in SQLite, resulting in a 3X improvement in throughput. Finally, DIBS reduces row contention on YCSB in MySQL, providing serializable isolation with a 1.4X improvement in throughput. 
    more » « less
  5. Pellizzoni, Rodolfo (Ed.)
    The goal of a real-time locking protocol is to reduce any priority-inversion blocking (pi-blocking) a task may incur while waiting to access a shared resource. For mutual-exclusion sharing on an m-processor platform, the best existing lower bound on per-task pi-blocking under suspension-oblivious analysis is a (trivial) lower bound of (m-1) request lengths under any job-level fixed-priority (JLFP) scheduler. Surprisingly, most asymptotically optimal locking protocols achieve a per-task pi-blocking upper bound of (2m-1) request lengths under JLFP scheduling, even though a range of very different mechanisms are used in these protocols. This paper closes the gap between these existing lower and upper bounds by establishing a lower bound of (2m-2) request lengths under global fixed-priority (G-FP) and global earliest-deadline-first (G-EDF) scheduling. This paper also shows that worst-case per-task pi-blocking can be arbitrarily close to (2m-1) request lengths for locking protocols that satisfy a certain property that is met by most (if not all) existing locking protocols. These results imply that most known asymptotically optimal locking protocols are almost truly optimal (not just asymptotic) under G-FP and G-EDF scheduling. 
    more » « less