skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 5:00 PM ET until 11:00 PM ET on Friday, June 21 due to maintenance. We apologize for the inconvenience.

Title: WIP: Towards the Practicality of the Adversarial Attack on Object Tracking in Autonomous Driving
Recently, adversarial examples against object detection have been widely studied. However, it is difficult for these attacks to have an impact on visual perception in autonomous driving because the complete visual pipeline of real-world autonomous driving systems includes not only object detection but also object tracking. In this paper, we present a novel tracker hijacking attack against the multi-target tracking algorithm employed by real-world autonomous driving systems, which controls the bounding box of object detection to spoof the multiple object tracking process. Our approach exploits the detection box generation process of the anchor-based object detection algorithm and designs new optimization methods to generate adversarial patches that can successfully perform tracker hijacking attacks, causing security risks. The evaluation results show that our approach has 85% attack success rate on two detection models employed by real-world autonomous driving systems. We discuss our potential next step for this work.  more » « less
Award ID(s):
2145493 1932464 1929771
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ISOC Symposium on Vehicle Security and Privacy (VehicleSec)
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent work in adversarial machine learning started to focus on the visual perception in autonomous driving and studied Adversarial Examples (AEs) for object detection models. However, in such visual perception pipeline the detected objects must also be tracked, in a process called Multiple Object Tracking (MOT), to build the moving trajectories of surrounding obstacles. Since MOT is designed to be robust against errors in object detection, it poses a general challenge to existing attack techniques that blindly target objection detection: we find that a success rate of over 98% is needed for them to actually affect the tracking results, a requirement that no existing attack technique can satisfy. In this paper, we are the first to study adversarial machine learning attacks against the complete visual perception pipeline in autonomous driving, and discover a novel attack technique, tracker hijacking, that can effectively fool MOT using AEs on object detection. Using our technique, successful AEs on as few as one single frame can move an existing object in to or out of the headway of an autonomous vehicle to cause potential safety hazards. We perform evaluation using the Berkeley Deep Drive dataset and find that on average when 3 frames are attacked, our attack can have a nearly 100% success rate while attacks that blindly target object detection only have up to 25%. 
    more » « less
  2. Modern autonomous systems rely on both object detection and object tracking in their visual perception pipelines. Although many recent works have attacked the object detection component of autonomous vehicles, these attacks do not work on full pipelines that integrate object tracking to enhance the object detector's accuracy. Meanwhile, existing attacks against object tracking either lack real-world applicability or do not work against a powerful class of object trackers, Siamese trackers. In this paper, we present AttrackZone, a new physically-realizable tracker hijacking attack against Siamese trackers that systematically determines valid regions in an environment that can be used for physical perturbations. AttrackZone exploits the heatmap generation process of Siamese Region Proposal Networks in order to take control of an object's bounding box, resulting in physical consequences including vehicle collisions and masked intrusion of pedestrians into unauthorized areas. Evaluations in both the digital and physical domain show that AttrackZone achieves its attack goals 92% of the time, requiring only 0.3-3 seconds on average. 
    more » « less
  3. null (Ed.)
    In Autonomous Driving (AD) systems, perception is both security and safety critical. Despite various prior studies on its security issues, all of them only consider attacks on cameraor LiDAR-based AD perception alone. However, production AD systems today predominantly adopt a Multi-Sensor Fusion (MSF) based design, which in principle can be more robust against these attacks under the assumption that not all fusion sources are (or can be) attacked at the same time. In this paper, we present the first study of security issues of MSF-based perception in AD systems. We directly challenge the basic MSF design assumption above by exploring the possibility of attacking all fusion sources simultaneously. This allows us for the first time to understand how much security guarantee MSF can fundamentally provide as a general defense strategy for AD perception. We formulate the attack as an optimization problem to generate a physically-realizable, adversarial 3D-printed object that misleads an AD system to fail in detecting it and thus crash into it. To systematically generate such a physical-world attack, we propose a novel attack pipeline that addresses two main design challenges: (1) non-differentiable target camera and LiDAR sensing systems, and (2) non-differentiable cell-level aggregated features popularly used in LiDAR-based AD perception. We evaluate our attack on MSF algorithms included in representative open-source industry-grade AD systems in real-world driving scenarios. Our results show that the attack achieves over 90% success rate across different object types and MSF algorithms. Our attack is also found stealthy, robust to victim positions, transferable across MSF algorithms, and physical-world realizable after being 3D-printed and captured by LiDAR and camera devices. To concretely assess the end-to-end safety impact, we further perform simulation evaluation and show that it can cause a 100% vehicle collision rate for an industry-grade AD system. We also evaluate and discuss defense strategies. 
    more » « less
  4. FPGA virtualization has garnered significant industry and academic interests as it aims to enable multi-tenant cloud systems that can accommodate multiple users' circuits on a single FPGA. Although this approach greatly enhances the efficiency of hardware resource utilization, it also introduces new security concerns. As a representative study, one state-of-the-art (SOTA) adversarial fault injection attack, named Deep-Dup, exemplifies the vulnerabilities of off-chip data communication within the multi-tenant cloud-FPGA system. Deep-Dup attacks successfully demonstrate the complete failure of a wide range of Deep Neural Networks (DNNs) in a black-box setup, by only injecting fault to extremely small amounts of sensitive weight data transmissions, which are identified through a powerful differential evolution searching algorithm. Such emerging adversarial fault injection attack reveals the urgency of effective defense methodology to protect DNN applications on the multi-tenant cloud-FPGA system. This paper, for the first time, presents a novel moving-target-defense (MTD) oriented defense framework DeepShuffle, which could effectively protect DNNs on multi-tenant cloud-FPGA against the SOTA Deep-Dup attack, through a novel lightweight model parameter shuffling methodology. DeepShuffle effectively counters the Deep-Dup attack by altering the weight transmission sequence, which effectively prevents adversaries from identifying security-critical model parameters from the repeatability of weight transmission during each inference round. Importantly, DeepShuffle represents a training-free DNN defense methodology, which makes constructive use of the typologies of DNN architectures to achieve being lightweight. Moreover, the deployment of DeepShuffle neither requires any hardware modification nor suffers from any performance degradation. We evaluate DeepShuffle on the SOTA open-source FPGA-DNN accelerator, Vertical Tensor Accelerator (VTA), which represents the practice of real-world FPGA-DNN system developers. We then evaluate the performance overhead of DeepShuffle and find it only consumes an additional ~3% of the inference time compared to the unprotected baseline. DeepShuffle improves the robustness of various SOTA DNN architectures like VGG, ResNet, etc. against Deep-Dup by orders. It effectively reduces the efficacy of evolution searching-based adversarial fault injection attack close to random fault injection attack, e.g., on VGG-11, even after increasing the attacker's effort by 2.3x, our defense shows a ~93% improvement in accuracy, compared to the unprotected baseline. 
    more » « less
  5. Abstract—Object perception plays a fundamental role in Cooperative Driving Automation (CDA) which is regarded as a revolutionary promoter for next-generation transportation systems. However, the vehicle-based perception may suffer from the limited sensing range and occlusion as well as low penetration rates in connectivity. In this paper, we propose Cyber Mobility Mirror (CMM), a next-generation real-world object perception system for 3D object detection, tracking, localization, and reconstruction, to explore the potential of roadside sensors for enabling CDA in the real world. The CMM system consists of six main components: i) the data pre-processor to retrieve and preprocess the raw data; ii) the roadside 3D object detector to generate 3D detection results; iii) the multi-object tracker to identify detected objects; iv) the global locator to generate geo-localization information; v) the mobile-edge-cloud-based communicator to transmit perception information to equipped vehicles, and vi) the onboard advisor to reconstruct and display the real-time traffic conditions. An automatic perception evaluation approach is proposed to support the assessment of data-driven models without human-labeling requirements and a CMM field-operational system is deployed at a real-world intersection to assess the performance of the CMM. Results from field tests demonstrate that our CMM prototype system can achieve 96.99% precision and 83.62% recall for detection and 73.55% ID-recall for tracking. High-fidelity real-time traffic conditions (at the object level) can be geo-localized with a root-mean-square error (RMSE) of 0.69m and 0.33m for lateral and longitudinal direction, respectively, and displayed on the GUI of the equipped vehicle with a frequency of 3 − 4Hz. 
    more » « less