This content will become publicly available on April 29, 2025

Title: Enhanced Cooperative Perception for Autonomous Vehicles Using Imperfect Communication
Sharing and joint processing of camera feeds and sensor measurements, known as Cooperative Perception (CP), has emerged as a new technique to achieve higher perception quality. CP can enhance the safety of Autonomous Vehicles (AVs) when their individual visual perception quality is compromised by adverse weather conditions (e.g., haze or fog), low illumination, winding roads, or crowded traffic. While previous CP methods have shown success in elevating perception quality, they often assume perfect communication conditions and unlimited transmission resources for sharing camera feeds, which may not hold in real-world scenarios. They also make no effort to select better helpers when multiple options are available. To address the limitations of these earlier methods, in this paper we propose a novel approach to realize optimized CP under constrained communications. At the core of our approach is recruiting the best helper from the available list of front vehicles to augment the visual range and enhance the Object Detection (OD) accuracy of the ego vehicle. In this two-step process, we first select the helper vehicles that contribute the most to CP, based on the widest visual range and the lowest motion blur. Next, we implement a radio block optimization among the candidate vehicles to further improve communication efficiency. We specifically focus on pedestrian detection as an exemplary scenario. To validate our approach, we used the CARLA simulator to create a dataset of annotated videos for different driving scenarios where pedestrian detection is challenging for an AV with compromised vision. Our results demonstrate the efficacy of our two-step optimization process in improving the overall performance of cooperative perception in challenging scenarios, substantially improving driving safety under adverse conditions.
Finally, we note that the networking assumptions are adopted from LTE Release 14 Mode 4 side-link communication, commonly used for Vehicle-to-Vehicle (V2V) commun…
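The two-step helper selection described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Helper` fields, the linear score with `blur_weight`, and the first-fit budget check that stands in for the paper's radio block optimization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Helper:
    name: str
    visual_range_m: float  # hypothetical: how far ahead the helper can see
    motion_blur: float     # hypothetical: 0 (sharp) to 1 (heavily blurred)
    blocks_needed: int     # hypothetical: radio blocks needed to send its feed


def shortlist(helpers, k=2, blur_weight=50.0):
    """Step 1 (assumed scoring): prefer long visual range and low motion blur."""
    def score(h):
        return h.visual_range_m - blur_weight * h.motion_blur
    return sorted(helpers, key=score, reverse=True)[:k]


def pick_within_budget(candidates, block_budget):
    """Step 2 (stand-in, not the paper's optimization): return the first
    shortlisted helper whose feed fits in the available radio blocks."""
    for h in candidates:
        if h.blocks_needed <= block_budget:
            return h
    return None


helpers = [
    Helper("A", visual_range_m=80.0, motion_blur=0.10, blocks_needed=40),
    Helper("B", visual_range_m=120.0, motion_blur=0.60, blocks_needed=30),
    Helper("C", visual_range_m=100.0, motion_blur=0.05, blocks_needed=60),
]
candidates = shortlist(helpers)                       # ranked by assumed score
chosen = pick_within_budget(candidates, block_budget=50)
print([h.name for h in candidates], chosen.name if chosen else None)
```

In this toy setup the best-scoring helper does not fit the radio budget, so the next candidate is recruited, which is the kind of trade-off between perception gain and communication cost that the abstract points to.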
Award ID(s):
2204721
PAR ID:
10542199
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
2024 20th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT)
Date Published:
ISBN:
979-8-3503-6944-1
Page Range / eLocation ID:
700 to 707
Format(s):
Medium: X
Location:
Abu Dhabi, United Arab Emirates
Sponsoring Org:
National Science Foundation
More Like this
  1. Interest in cooperative perception is growing quickly due to its remarkable performance in improving perception capabilities for connected and automated vehicles. This improvement is crucial, especially for automated driving scenarios in which perception performance is one of the main bottlenecks to the development of safety and efficiency. However, current cooperative perception methods typically assume that all collaborating vehicles have enough communication bandwidth to share all features with an identical spatial size, which is impractical for real-world scenarios. In this paper, we propose Adaptive Cooperative Perception, a new cooperative perception framework that is not limited by the aforementioned assumptions, aiming to enable cooperative perception under more realistic and challenging conditions. To support this, a novel feature encoder is proposed and named Pillar Attention Encoder. A pillar attention mechanism is designed to extract the feature data while considering its significance for the perception task. An adaptive feature filter is proposed to adjust the size of the feature data for sharing by considering the importance value of the feature. Experiments are conducted for cooperative object detection from multiple vehicle-based and infrastructure-based LiDAR sensors under various communication conditions. Results demonstrate that our method can successfully handle dynamic communication conditions and improve the mean Average Precision by 10.18% when compared with the state-of-the-art feature encoder. 
  2. Vehicle detection with visual sensors like lidar and camera is one of the critical functions enabling autonomous driving. While they generate fine-grained point clouds or high-resolution images with rich information in good weather conditions, they fail in adverse weather (e.g., fog) where opaque particles distort light and significantly reduce visibility. Thus, existing methods relying on lidar or camera experience significant performance degradation in rare but critical adverse weather conditions. To remedy this, we resort to exploiting complementary radar, which is less impacted by adverse weather and is becoming prevalent on vehicles. In this paper, we present Multimodal Vehicle Detection Network (MVDNet), a two-stage deep fusion detector, which first generates proposals from two sensors and then fuses region-wise features between multimodal sensor streams to improve final detection results. To evaluate MVDNet, we create a procedurally generated training dataset based on the collected raw lidar and radar signals from the open-source Oxford Radar Robotcar. We show that the proposed MVDNet surpasses other state-of-the-art methods, notably in terms of Average Precision (AP), especially in adverse weather conditions. The code and data are available at https://github.com/qiank10/MVDNet.
  3. Connected and Automated Vehicles (CAVs) have the potential to enhance traffic safety and efficiency. However, aligning both vehicles’ utility with system-level interests in scenarios with conflicting road rights is challenging, which hinders cooperative driving. This paper advocates a game theory model that strategically incorporates deceptive information within incomplete-information vehicle games, operating under the premise of imprecise perceptions. The equilibria derived reveal that CAVs can exploit deceptive strategies, not only gaining advantages that undermine the utility of the other vehicle in the game but also posing hazards to the overall benefits of the transportation system. Extensive experiments were conducted, simulating diverse inbound traffic conditions at an intersection; they validate the detrimental impact on efficiency and safety of CAVs that have perception uncertainties and employ deceptive maneuvers within connected and automated transportation systems. Finally, the paper proposes feasible solutions and potential countermeasures to address the adverse consequences of deception in connected and automated transportation systems. It concludes by calling for these insights to be integrated into future research in order to fully realize the potential and expectations of CAVs in enhancing overall traffic performance.
  4. Utilizing the latest advances in Artificial Intelligence (AI), the computer vision community is now witnessing an unprecedented evolution in all kinds of perception tasks, particularly in object detection. Based on multiple spatially separated perception nodes, Cooperative Perception (CP) has emerged to significantly advance the perception of automated driving. However, current cooperative object detection methods mainly focus on ego-vehicle efficiency without considering the practical issues of system-wide costs. In this paper, we introduce VINet, a unified deep learning-based CP network for scalable, lightweight, and heterogeneous cooperative 3D object detection. VINet is the first CP method designed from the standpoint of large-scale system-level implementation and can be divided into three main phases: (1) Global Pre-Processing and Lightweight Feature Extraction, which prepare the data into global style and extract features for cooperation in a lightweight manner; (2) Two-Stream Fusion, which fuses the features from scalable and heterogeneous perception nodes; and (3) Central Feature Backbone and 3D Detection Head, which further process the fused features and generate cooperative detection results. An open-source data experimental platform is designed and developed for CP dataset acquisition and model evaluation. The experimental analysis shows that VINet can reduce system-level computational cost by 84% and system-level communication cost by 94% while improving the 3D detection accuracy.
  5. Vehicle-to-pedestrian communication could significantly improve pedestrian safety at signalized intersections. However, it is unlikely that pedestrians will always be carrying a low-latency, communication-enabled hand-held device with an activated pedestrian safety application. Because of this, multiple traffic cameras at a signalized intersection could be used to accurately detect and locate pedestrians using deep learning, and to broadcast pedestrian-related safety alerts to warn connected and automated vehicles around signalized intersections. However, the unavailability of high-performance roadside computing infrastructure and the limited network bandwidth between traffic cameras and the computing infrastructure limit real-time data streaming and processing for pedestrian detection. In this paper, we describe an edge computing-based real-time pedestrian detection strategy that combines a pedestrian detection algorithm using deep learning and an efficient data communication approach to reduce bandwidth requirements while maintaining high pedestrian detection accuracy. We utilize a lossy compression technique on traffic camera data to determine the tradeoff between the reduction of the communication bandwidth requirements and a defined pedestrian detection accuracy. The performance of the pedestrian detection strategy is measured in relation to pedestrian classification accuracy with varying peak signal-to-noise ratios. The analyses reveal that we detect pedestrians while maintaining a defined detection accuracy at a peak signal-to-noise ratio of 43 dB, reducing the communication bandwidth from 9.82 Mbits/sec to 0.31 Mbits/sec, a 31× reduction.
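As a quick check on the figures quoted in item 5, the short sketch below computes the bandwidth reduction factor and shows the standard peak signal-to-noise ratio formula, PSNR = 10·log10(MAX² / MSE). The function names and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not say how its PSNR values were computed.

```python
import math


def psnr_db(mse, max_value=255.0):
    """Standard PSNR in dB for a given mean squared error.
    max_value=255 assumes 8-bit pixels (an assumption, not from the abstract)."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def reduction_factor(before_mbps, after_mbps):
    """How many times smaller the bandwidth is after compression."""
    return before_mbps / after_mbps


# Bandwidth figures quoted in the abstract (Mbits/sec).
factor = reduction_factor(9.82, 0.31)
print(f"reduction factor: {factor:.1f}x")
```

The ratio comes out slightly above 31, consistent with the roughly 31× reduction the abstract reports.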