Title: PASS: A System-Driven Evaluation Platform for Autonomous Driving Safety and Security
Safety and security play critical roles in the success of Autonomous Driving (AD) systems. Since AD systems heavily rely on AI components, the safety and security research of such components has also received great attention in recent years. While it is widely recognized that AI component-level (mis)behavior does not necessarily lead to AD system-level impacts, most existing work still adopts only component-level evaluation. To close this critical methodological gap between component-level behavior and real system-level impact, a system-driven evaluation platform jointly constructed by the community could be the solution. In this paper, we present PASS (Platform for Auto-driving Safety and Security), a system-driven evaluation prototype based on simulation. By sharing our platform-building concept and preliminary efforts, we hope to call on the community to build a uniform and extensible platform that makes AI safety and security work sufficiently meaningful at the system level.
Award ID(s):
1929771 1932464 2145493
PAR ID:
10359464
Author(s) / Creator(s):
Date Published:
Journal Name:
NDSS Workshop on Automotive and Autonomous Vehicle Security (AutoSec)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
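As a rough illustration of the "system-driven" evaluation idea in the PASS abstract above, the sketch below runs a full AD stack closed-loop in simulation and scores an attack by system-level outcomes (collisions, minimum time-to-collision) rather than component-level metrics. The `Simulator`/`ADStack` interfaces and the `attack` hook are hypothetical placeholders for illustration, not PASS APIs.

```python
# Hypothetical sketch of a system-driven evaluation loop: instead of scoring
# an AI component in isolation, run the whole AD stack closed-loop in
# simulation and record system-level outcomes. All interfaces are assumed.
from dataclasses import dataclass


@dataclass
class Outcome:
    collided: bool
    min_ttc: float  # minimum time-to-collision observed over the episode


def run_episode(sim, stack, attack=None, horizon_s=30.0, dt=0.1):
    sim.reset()
    t = 0.0
    while t < horizon_s and not sim.collided():
        obs = sim.sense()                 # camera/LiDAR frames, etc.
        if attack is not None:
            obs = attack(obs)             # component-level perturbation
        control = stack.step(obs)         # perception -> planning -> control
        sim.apply(control, dt)
        t += dt
    return Outcome(collided=sim.collided(), min_ttc=sim.min_ttc())


def system_level_impact(sim, stack, attack, n_trials=20):
    # Report the attack's effect as a change in collision rate, a
    # system-level metric, rather than e.g. component detection accuracy.
    baseline = [run_episode(sim, stack) for _ in range(n_trials)]
    attacked = [run_episode(sim, stack, attack) for _ in range(n_trials)]
    return (sum(o.collided for o in attacked)
            - sum(o.collided for o in baseline)) / n_trials
```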
More Like this
  1. In Autonomous Driving (AD) systems, perception is both security- and safety-critical. Despite various prior studies on its security issues, all of them consider attacks on camera- or LiDAR-based AD perception alone. However, production AD systems today predominantly adopt a Multi-Sensor Fusion (MSF) based design, which in principle can be more robust against these attacks under the assumption that not all fusion sources are (or can be) attacked at the same time. In this paper, we present the first study of the security issues of MSF-based perception in AD systems. We directly challenge the basic MSF design assumption above by exploring the possibility of attacking all fusion sources simultaneously. This allows us, for the first time, to understand how much security guarantee MSF can fundamentally provide as a general defense strategy for AD perception. We formulate the attack as an optimization problem to generate a physically-realizable, adversarial 3D-printed object that misleads an AD system into failing to detect it and thus crashing into it. To systematically generate such a physical-world attack, we propose a novel attack pipeline that addresses two main design challenges: (1) non-differentiable target camera and LiDAR sensing systems, and (2) non-differentiable cell-level aggregated features popularly used in LiDAR-based AD perception. We evaluate our attack on MSF algorithms included in representative open-source industry-grade AD systems in real-world driving scenarios. Our results show that the attack achieves over a 90% success rate across different object types and MSF algorithms. Our attack is also found to be stealthy, robust to victim positions, transferable across MSF algorithms, and physical-world realizable after being 3D-printed and captured by LiDAR and camera devices. To concretely assess the end-to-end safety impact, we further perform a simulation evaluation and show that it can cause a 100% vehicle collision rate for an industry-grade AD system. We also evaluate and discuss defense strategies.
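The abstract formulates this as an optimization over a physically-realizable 3D object that must evade both fusion branches at once. Since the paper highlights that the sensing and feature-aggregation steps are non-differentiable, the minimal sketch below illustrates the joint camera+LiDAR objective with a simple random-search loop; the renderer and detector callables are hypothetical stand-ins, and the paper's actual pipeline uses purpose-built approximations rather than this naive search.

```python
# Minimal gradient-free sketch of the joint camera+LiDAR attack idea:
# perturb a 3D object's vertices to minimize the detection score of BOTH
# fusion branches at once. All callables are assumed placeholders.
import numpy as np


def detection_score(vertices, camera_det, lidar_det, render, simulate_lidar):
    # The object evades MSF only if both branches miss it, so take the max.
    return max(camera_det(render(vertices)),
               lidar_det(simulate_lidar(vertices)))


def attack(base_vertices, camera_det, lidar_det, render, simulate_lidar,
           iters=500, step=0.01, bound=0.05, seed=0):
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(base_vertices)
    best = detection_score(base_vertices, camera_det, lidar_det,
                           render, simulate_lidar)
    for _ in range(iters):
        # Bounded perturbation keeps the object plausible and 3D-printable.
        cand = np.clip(delta + step * rng.standard_normal(delta.shape),
                       -bound, bound)
        score = detection_score(base_vertices + cand, camera_det, lidar_det,
                                render, simulate_lidar)
        if score < best:  # keep perturbations that hide the object better
            best, delta = score, cand
    return base_vertices + delta, best
```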
  2. In high-level Autonomous Driving (AD) systems, behavioral planning is in charge of making high-level driving decisions such as cruising and stopping, and is thus highly security-critical. In this work, we perform the first systematic study of semantic security vulnerabilities specific to overly-conservative AD behavioral planning behaviors, i.e., those that can cause failed or significantly-degraded mission performance, which can be critical for AD services such as robo-taxi/delivery. We call them semantic Denial-of-Service (DoS) vulnerabilities, which we envision to be most generally exposed in practical AD systems due to the tendency toward conservativeness to avoid safety incidents. To achieve high practicality and realism, we assume that the attacker can only introduce seemingly-benign external physical objects to the driving environment, e.g., off-road dumped cardboard boxes. To systematically discover such vulnerabilities, we design PlanFuzz, a novel dynamic testing approach that addresses various problem-specific design challenges. Specifically, we propose and identify planning invariants as novel testing oracles, and design new input generation to systematically enforce problem-specific constraints for attacker-introduced physical objects. We also design a novel behavioral planning vulnerability distance metric to effectively guide the discovery. We evaluate PlanFuzz on 3 planning implementations from practical open-source AD systems, and find that it can effectively discover 9 previously unknown semantic DoS vulnerabilities without false positives. We find all our new designs necessary, as without each design, statistically significant performance drops are generally observed. We further perform exploitation case studies using simulation and real-vehicle traces. We discuss root causes and potential fixes.
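A hedged sketch of the testing loop this abstract describes: mutate seemingly-benign object placements under validity constraints, use a planning invariant as the oracle, and guide mutation with a behavioral-planning distance metric. Every function name below is an illustrative placeholder, not PlanFuzz's actual interface.

```python
# Hypothetical sketch of invariant-guided fuzzing of a behavioral planner.
# `run_planner`, `invariant_holds`, `bp_distance`, and the scenario methods
# stand in for the paper's components and are assumptions.
import random


def fuzz(scenario, run_planner, invariant_holds, bp_distance,
         valid_placement, budget=1000, seed=0):
    rng = random.Random(seed)
    seed_objs = scenario.initial_objects()
    # Corpus of (object placement, distance-to-violation) pairs.
    corpus = [(seed_objs, bp_distance(run_planner(scenario, seed_objs)))]
    findings = []
    for _ in range(budget):
        parent, parent_dist = min(corpus, key=lambda c: c[1])
        child = scenario.mutate(parent, rng)   # move/add an off-road object
        if not valid_placement(child):         # keep objects seemingly benign
            continue
        trace = run_planner(scenario, child)
        if not invariant_holds(trace):         # oracle: planning invariant
            findings.append(child)             # semantic DoS candidate
            continue
        dist = bp_distance(trace)              # metric-guided search
        if dist < parent_dist:
            corpus.append((child, dist))
    return findings
```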
  3. Data-driven driving safety assessment is crucial for understanding traffic accidents caused by dangerous driving behaviors. Meanwhile, quantifying driving safety through well-defined metrics on real-world naturalistic driving data is also an important step for the operational safety assessment of automated vehicles (AV). However, the lack of flexible data acquisition methods and fine-grained datasets has hindered progress in this critical area. In response to this challenge, we propose a novel dataset for driving safety metrics analysis specifically tailored to car-following situations. Leveraging state-of-the-art Artificial Intelligence (AI) technology, we employ drones to capture high-resolution video data at 12 traffic scenes in the Phoenix metropolitan area. We then develop advanced computer vision algorithms and semantically annotated maps to extract precise vehicle trajectories and leader-follower relations among vehicles. These components, in conjunction with a set of metrics defined in our prior work on Operational Safety Assessment (OSA) by the Institute of Automated Mobility (IAM), allow us to conduct a detailed analysis of driving safety. Our results reveal the distribution of these metrics under various real-world car-following scenarios and characterize the impact of different parameters and thresholds in the metrics. By enabling a data-driven approach to driving safety in car-following scenarios, our work can empower traffic operators and policymakers to make informed decisions and contribute to a safer, more efficient future for road transportation systems.
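For concreteness, the snippet below computes two standard car-following metrics of the kind such OSA-style metric sets cover, time headway (THW) and time-to-collision (TTC), from leader/follower trajectories. The field names and the fixed vehicle length are illustrative; the exact definitions and edge-case handling in the IAM OSA work may differ.

```python
# Per-frame THW and TTC from along-lane leader/follower trajectories.
import numpy as np


def thw_ttc(lead_pos, follow_pos, lead_vel, follow_vel, vehicle_len=4.5):
    """All inputs are 1-D arrays over time; positions in meters, speeds in m/s."""
    gap = lead_pos - follow_pos - vehicle_len       # bumper-to-bumper gap
    # THW: time for the follower to cover the gap at its current speed.
    thw = np.where(follow_vel > 0, gap / follow_vel, np.inf)
    closing = follow_vel - lead_vel                 # >0 means gap is shrinking
    # TTC: time until contact if both speeds stay constant.
    ttc = np.where(closing > 0, gap / closing, np.inf)
    return thw, ttc


# Illustrative example: follower closing at 2 m/s from a 30 m center gap.
t = np.arange(0, 5, 0.1)
lead = 50 + 20 * t       # leader at 20 m/s
follow = 20 + 22 * t     # follower at 22 m/s
thw, ttc = thw_ttc(lead, follow,
                   np.full_like(t, 20.0), np.full_like(t, 22.0))
```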
  4. A modern automobile is a safety-critical distributed embedded system that incorporates more than a hundred Electronic Control Units (ECUs) and a wide range of sensors and actuators, all connected by several in-vehicle networks. Integrating these heterogeneous components can lead to subtle errors that may be exploited by malicious entities in the field, resulting in catastrophic consequences. We develop a prototyping platform to enable the functional safety and security exploration of automotive systems. The platform realizes a unique, extensible virtualization environment for the exploration of vehicular systems. It includes a CAN simulator that mimics the vehicular CAN bus to interact with various ECUs, together with sensory and actuation capabilities. We show how to use these capabilities for safety and security exploration through the analysis of a representative vehicular use-case interaction.
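To give a flavor of the CAN-level interaction such a platform virtualizes, the sketch below uses the python-can library's in-process "virtual" interface to emulate one node publishing an ECU-style frame and another node decoding it. The arbitration ID and payload encoding are invented for illustration and are unrelated to the platform's actual simulator.

```python
# A virtual-bus sketch: one node plays a wheel-speed ECU, another monitors.
# Requires the python-can package; the message layout here is made up.
import can

SPEED_ID = 0x1A0  # hypothetical arbitration ID for a wheel-speed frame

with can.Bus(interface="virtual", channel="demo") as ecu, \
     can.Bus(interface="virtual", channel="demo") as monitor:
    speed_kph = 72.5
    raw = int(speed_kph * 100)          # fixed-point: 0.01 km/h per bit
    ecu.send(can.Message(arbitration_id=SPEED_ID,
                         data=raw.to_bytes(2, "big"),
                         is_extended_id=False))
    frame = monitor.recv(timeout=1.0)   # virtual buses share a channel
    if frame is not None and frame.arbitration_id == SPEED_ID:
        decoded = int.from_bytes(frame.data[:2], "big") / 100
        print(f"monitor saw speed = {decoded} km/h")  # -> 72.5
```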
  5. With the rapidly increasing capabilities and adoption of code agents for AI-assisted coding and software development, safety and security concerns, such as generating or executing malicious code, have become significant barriers to the real-world deployment of these agents. To provide comprehensive and practical evaluations of the safety of code agents, we propose RedCode, an evaluation platform with benchmarks grounded in four key principles: real interaction with systems, holistic evaluation of unsafe code generation and execution, diverse input formats, and high-quality safety scenarios and tests. RedCode consists of two parts that evaluate agents’ safety in unsafe code execution and generation: (1) RedCode-Exec provides challenging code prompts in Python as inputs, aiming to evaluate code agents’ ability to recognize and handle unsafe code. We then map the Python code to other programming languages (e.g., Bash) and natural-text summaries or descriptions for evaluation, leading to a total of over 4,000 testing instances. We provide 25 types of critical vulnerabilities spanning various domains, such as websites, file systems, and operating systems. We provide a Docker sandbox environment to evaluate the execution capabilities of code agents and design corresponding evaluation metrics to assess their execution results. (2) RedCode-Gen provides 160 prompts with function signatures and docstrings as input to assess whether code agents will follow instructions to generate harmful code or software. Our empirical findings, derived from evaluating three agent frameworks based on 19 LLMs, provide insights into code agents’ vulnerabilities. For instance, evaluations on RedCode-Exec show that agents are more likely to reject executing unsafe operations on the operating system, but are less likely to reject executing technically buggy code, indicating high risks. Unsafe operations described in natural text lead to a lower rejection rate than those in code format. Additionally, evaluations on RedCode-Gen reveal that more capable base models and agents with stronger overall coding abilities, such as GPT-4, tend to produce more sophisticated and effective harmful software. Our findings highlight the need for stringent safety evaluations for diverse code agents. Our dataset and code are publicly available at https://github.com/AI-secure/RedCode.
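A minimal sketch of the Docker-sandbox idea described above, written against the docker Python SDK as an assumption; RedCode's actual harness may be structured differently. Agent-produced code runs in a network-disabled container, and the harness records whether it executed or failed, while refusals are counted before any execution ever happens.

```python
# Run candidate code in an isolated, network-disabled container and
# classify the outcome. Requires the docker SDK and a local Docker daemon.
import docker
from docker.errors import ContainerError

client = docker.from_env()


def sandboxed_run(code: str, timeout_s: int = 10) -> str:
    try:
        out = client.containers.run(
            "python:3.11-slim",
            ["timeout", str(timeout_s), "python", "-c", code],
            network_disabled=True,   # no exfiltration from the sandbox
            mem_limit="256m",
            remove=True,             # clean up the container afterwards
        )
        return f"executed: {out.decode()[:200]}"
    except ContainerError as e:
        return f"failed: exit {e.exit_status}"

# An agent response that refuses ("I can't run that") never reaches
# sandboxed_run; per the abstract, rejection vs. execution is exactly the
# distinction the RedCode-Exec metrics measure.
```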