skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: PAAM: A Framework for Coordinated and Priority-Driven Accelerator Management in ROS 2
This paper proposes a Priority-driven Accelerator Access Management (PAAM) framework for multi-process robotic applications built on top of the Robot Operating System (ROS) 2 middleware platform. The framework addresses the issue of predictable execution of time- and safety-critical callback chains that require hardware accelerators such as GPUs and TPUs. PAAM provides a standalone ROS executor that acts as an accelerator resource server, arbitrating accelerator access requests from all other callbacks at the application layer. This approach enables coordinated and priority-driven accelerator access management in multi-process robotic systems. The framework design is directly applicable to all types of accelerators and enables granular control over how specific chains access accelerators, making it possible to achieve predictable real-time support for accelerators used by safety-critical callback chains without making changes to underlying accelerator device drivers. The paper shows that PAAM also offers a theoretical analysis that can upper bound the worst-case response time of safety-critical callback chains that necessitate accelerator access. This paper also demonstrates that complex robotic systems with extensive accelerator usage that are integrated with PAAM may achieve up to a 91% reduction in end-to-end response time of their critical callback chains.  more » « less
Award ID(s):
1943265 2312395 1955650
PAR ID:
10527891
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-5841-4
Page Range / eLocation ID:
81 to 94
Format(s):
Medium: X
Location:
Hong Kong, Hong Kong
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    In ROS (Robot Operating System), most applications in time- and safety-critical domain are constructed in the form of callback chains with data dependencies. Due to the shortcomings in its real-time support, ROS does not provide a strong timing guarantee and may lead to disastrous results. Although ROS2 claims to enhance the real-time capability, ensuring predictable end-to-end chain latency still remains a challenging problem. In this paper, we propose a new priority-driven chain-aware scheduler for the ROS2 framework and present end-to-end latency analysis for the proposed scheduler. With our scheduler, callbacks are prioritized based on the given timing requirements of the corresponding chains so that the end-to-end latency of critical chains can be improved with a predictable bound. The proposed scheduling design includes priority assignment and resource allocation considering all ROS2 scheduling-related abstractions, e.g., callbacks, nodes, and executors. To the best of our knowledge, this is the first work to address the inherent limitations of ROS2 in end-to-end latency by proposing a new scheduler design. We have implemented our scheduler in ROS2 running on NVIDIA Xavier NX. We have conducted case studies and schedulability experiments. The results show that the proposed scheduler yields a substantial improvement in end-to-end latency over the default ROS2 scheduler and the latest work in real-world scenarios. 
    more » « less
  2. This paper presents Latency Management Executor (LaME), a theory-guided adaptive scheduling framework that enhances real-time performance in ROS 2 through dynamic resource allocation and hybrid priority-driven scheduling. LaME introduces the concept of threadclasses to dynamically adjust system configurations, ensuring response-time guarantees for real-time chains while maintaining starvation freedom for best-effort chains. By implementing adaptive resource allocation and continuous runtime monitoring, LaME provides robust response times even under fluctuating workloads and resource constraints. We implement our framework for the Autoware reference system and perform our evaluation on an Nvidia Jetson platform. Our results demonstrate that LaME successfully adapts to changing resource availability and workload surges, and effectively balances real-time guarantees with overall system throughput. 
    more » « less
  3. Real-time systems are widely applied in different areas like autonomous vehicles, where safety is the key metric. However, on the FPGA platform, most of the prior accelerator frameworks omit discussing the schedulability in such real-time safety-critical systems, leaving deadlines unmet, which can lead to catastrophic system failures. To address this, we propose the ART framework, a hardware-software co-design approach that transforms baseline accelerators into “real-time guaranteed" accelerators. On the software side, ART performs schedulability analysis and preemption point placement, optimizing task scheduling to meet deadlines and enhance throughput. On the hardware side, ART integrates the Global Earliest Deadline First (GEDF) scheduling algorithm, implements preemption, and conducts source code transformation to transform baseline HLS-based accelerators into designs targeted for real-time systems capable of saving and resuming tasks. ART also includes integration, debugging, and testing tools for full-system implementation. We demonstrate the methodology of ART on two kinds of popular accelerator models and evaluate on AMD Versal VCK190 platform, where ART meets schedulability requirements that baseline accelerators fail. ART is lightweight, utilizing <0.5% resources. With about 100 lines of user input, ART generates about 2.5k lines of accelerator code, making it a push-button solution. 
    more » « less
  4. With the growing prevalence of edge AI, systems are increasingly required to meet stringent and diverse service level objectives (SLOs), such as maintaining specific accuracy levels, ensuring sufficient inference throughput, and meeting deadlines, often simultaneously. However, concurrently achieving these varied and complex SLOs is particularly challenging due to the resource constraints of edge devices and the heterogeneity of AI accelerators. To address this gap, we present a novel AI scheduling framework, Convergo, which uniquely integrates heterogeneous accelerator management, multi-tenancy, and multi-SLO prioritization into one scheduling solution. Convergo not only leverages heterogeneous AI accelerators and supports AI multi-tenancy, but also integrates scheduling heuristics to meet multiple SLOs concurrently. Convergo enables the simultaneous satisfaction of multiple/complex SLO requirements (e.g., accuracy, throughput, and deadline constraints). The scheduling algorithm prioritizes inference requests, imposes critical constraints, and selects the best model combinations for current inferencing. We evaluated Convergo on the Jetson Xavier platform with portable TPU accelerators across various AI workloads, demonstrating its effectiveness. The evaluation results show that Convergo outper- forms state-of-the-art baselines, achieving over 90% satisfaction of all three distinct SLO requirements simultaneously while maintaining approximately 95% satisfaction for individual SLOs. Furthermore, Convergo achieves these results with negligible overhead, making it a promising solution for edge AI systems. 
    more » « less
  5. Increasingly popular Robot Operating System (ROS) framework allows building robotic systems by integrating newly developed and/or reused modules, where the modules can use different versions of the framework (e.g., ROS1 or ROS2) and programming language (e.g. C++ or Python). The majority of such robotic systems' work happens in callbacks. The framework provides various elements for initializing callbacks and for setting up the execution of callbacks. It is the responsibility of developers to compose callbacks and their execution setup elements, and hence can lead to inconsistencies related to the setup of callback execution due to developer's incomplete knowledge of the semantics of elements in various versions of the framework. Some of these inconsistencies do not throw errors at runtime, making their detection difficult for developers. We propose a static approach to detecting such inconsistencies by extracting a static view of the composition of robotic system's callbacks and their execution setup, and then checking it against the composition conventions based on the elements' semantics. We evaluate our ROSCallBaX prototype on the dataset created from the posts on developer forums and ROS projects that are publicly available. The evaluation results show that our approach can detect real inconsistencies. 
    more » « less