Autonomy Today: Many Delay-Prone Black Boxes

Liu, Sizhe; Wagle, Rohan; Anderson, James H; Yang, Ming; Zhang, Chi; Li, Yunhua

doi:10.4230/LIPICS.ECRTS.2024.12

Citation Details

Machine-learning (ML) technology has been a key enabler in the push towards realizing ever more sophisticated autonomous-driving features. In deploying such technology, the automotive industry has relied heavily on using "black-box" software and hardware components that were originally intended for non-safety-critical contexts, without a full understanding of their real-time capabilities. A prime example of such a component is CUDA, which is fundamental to the acceleration of ML algorithms using NVIDIA GPUs. In this paper, evidence is presented demonstrating that CUDA can cause unbounded task delays. Such delays are the result of CUDA’s usage of synchronization mechanisms in the POSIX thread (pthread) library, so the latter is implicated as a delay-prone component as well. Such synchronization delays are shown to be the source of a system failure that occurred in an actual autonomous vehicle system during testing at WeRide. Motivated by these findings, a broader experimental study is presented that demonstrates several real-time deficiencies in CUDA, the glibc pthread library, Linux, and the POSIX interface of the safety-certified QNX Operating System for Safety. Partial mitigations for these deficiencies are presented and further actions are proposed for real-time researchers and developers to integrate more complete mitigations.

Award ID(s): 2038855 2333120 2151829

PAR ID: 10560543

Author(s) / Creator(s): Liu, Sizhe; Wagle, Rohan; Anderson, James H; Yang, Ming; Zhang, Chi; Li, Yunhua

Editor(s): Pellizzoni, Rodolfo

Publisher / Repository: Schloss Dagstuhl – Leibniz-Zentrum für Informatik

Date Published: 2024-01-01

Volume: 298

ISSN: 1868-8969

ISBN: 978-3-95977-324-9

Page Range / eLocation ID: 298-298

Subject(s) / Keyword(s): autonomous driving; CUDA programming; locking protocols; POSIX thread; operating systems; machine learning systems; real-time systems; Computer systems organization → Real-time operating systems; Software and its engineering → Process synchronization

Format(s): Medium: X Size: 27 pages; 2698899 bytes Other: application/pdf

Size(s): 27 pages; 2698899 bytes

Right(s): Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.4230/LIPICS.ECRTS.2024.12