
Title: A Pedestrian Detection and Tracking Framework for Autonomous Cars: Efficient Fusion of Camera and LiDAR Data
This paper presents a novel method for pedestrian detection and tracking that fuses camera and LiDAR sensor data. To deal with the challenges associated with autonomous driving scenarios, an integrated detection and tracking framework is proposed. In the detection phase, LiDAR streams are converted to computationally tractable depth images, and a deep neural network then identifies pedestrian candidates in both RGB and depth images. To provide accurate information, the detection phase is further enhanced by fusing the multi-modal sensor information using a Kalman filter. The tracking phase combines Kalman filter prediction with an optical flow algorithm to track multiple pedestrians in a scene. We evaluate our framework on a public real-world driving dataset. Experimental results demonstrate that the proposed method achieves a significant performance improvement over a baseline that relies on image-based pedestrian detection alone.
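To make the two core steps of the abstract concrete, the minimal sketch below (not the authors' implementation) shows (a) projecting a LiDAR point cloud into a depth image and (b) fusing camera and LiDAR/depth pedestrian detections with a constant-velocity Kalman filter. The intrinsic matrix K, the extrinsic transform T_cam_lidar, the image size, and all noise covariances are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): LiDAR-to-depth-image conversion and
# Kalman-filter fusion of camera/LiDAR pedestrian detections.
# Assumed/hypothetical: K, T_cam_lidar, image size, and noise covariances.
import numpy as np

def lidar_to_depth_image(points_lidar, K, T_cam_lidar, h=375, w=1242):
    """Project Nx3 LiDAR points into a sparse depth image (meters)."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]          # into camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0]                # keep points in front
    uvw = (K @ pts_cam.T).T
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    depth = np.zeros((h, w), dtype=np.float32)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth[v[valid], u[valid]] = pts_cam[valid, 2]       # z-depth per pixel
    return depth

class ConstantVelocityKF:
    """Fuses pedestrian position measurements from two detectors."""
    def __init__(self, dt=0.1):
        self.x = np.zeros(4)                            # [px, py, vx, vy]
        self.P = np.eye(4)
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                           # observe position only
        self.Q = 0.01 * np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x

    def update(self, z, R):
        S = self.H @ self.P @ self.H.T + R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P

# Fusion idea: one predict step, then sequential updates with each modality,
# each carrying its own (assumed) measurement noise.
kf = ConstantVelocityKF()
kf.predict()
kf.update(np.array([12.0, 3.1]), R=2.0 * np.eye(2))    # camera detection
kf.update(np.array([11.8, 3.0]), R=0.5 * np.eye(2))    # LiDAR/depth detection
```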
Authors:
Award ID(s):
2018879 2000320
Publication Date:
NSF-PAR ID:
10326311
Journal Name:
2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Page Range or eLocation-ID:
1287 to 1292
Sponsoring Org:
National Science Foundation
More Like this
  1. 3D LiDAR scanners are playing an increasingly important role in autonomous driving, as they can generate depth information about the environment. However, creating large 3D LiDAR point cloud datasets with point-level labels requires a significant amount of manual annotation. This jeopardizes the efficient development of supervised deep learning algorithms, which are often data-hungry. We present a framework to rapidly create point clouds with accurate point-level labels from a computer game. To the best of our knowledge, this is the first published LiDAR point cloud simulation framework for autonomous driving. The framework supports data collection from both auto-driving scenes and user-configured scenes. Point clouds from auto-driving scenes can be used as training data for deep learning algorithms, while point clouds from user-configured scenes can be used to systematically test the vulnerability of a neural network, with the falsifying examples then used to make the neural network more robust through retraining. In addition, the scene images can be captured simultaneously for sensor fusion tasks, with a method proposed to perform automatic registration between the point clouds and the captured scene images. We show a significant improvement in accuracy (+9%) in point cloud segmentation by augmenting the training dataset with the generated synthesized data. Our experiments also show that, by testing and retraining the network using point clouds from user-configured scenes, the weaknesses/blind spots of the neural network can be fixed.
  2. This paper presents a Multiplicative Extended Kalman Filter (MEKF) framework using a state-of-the-art velocimeter Light Detection and Ranging (LIDAR) sensor for Terrain Relative Navigation (TRN) applications. The newly developed velocimeter LIDAR is capable of providing simultaneous position, Doppler velocity, and reflectivity measurements for every point in the point cloud. This information, along with pseudo-measurements from point cloud registration techniques, a novel bulk velocity batch state estimation process, and inertial measurement data, is fused within a traditional Kalman filter architecture (the multiplicative attitude-correction step that distinguishes an MEKF is sketched after this list). Results from extensive emulation robotics experiments performed at Texas A&M’s Land, Air, and Space Robotics (LASR) laboratory and from Monte Carlo simulations are presented to evaluate the efficacy of the proposed algorithms.
  3. This work attempts to answer two problems: (1) Can we use the odometry information from two different Simultaneous Localization And Mapping (SLAM) algorithms to get a better estimate of the odometry? (2) What if one of the SLAM algorithms is affected by shot noise or attack vectors, and can we resolve this situation? To answer the first question, we focus on fusing odometries from LiDAR-based SLAM and visual-based SLAM using the Extended Kalman Filter (EKF) algorithm. The second question is answered by introducing the Maximum Correntropy Criterion - Extended Kalman Filter (MCC-EKF), which assists in removing/minimizing shot noise or attack vectors injected into the system (a simplified sketch of this down-weighting idea appears after this list). We manually simulate the shot noise and observe how the system responds to the noise vectors. We also evaluate our approach on the KITTI dataset for self-driving cars.
  4. This paper presents a navigation system for autonomous rendezvous, proximity operations, and docking (RPOD) with respect to non-cooperative space objects using a novel velocimeter light detection and ranging (LIDAR) sensor. Given only raw position and Doppler velocity measurements, the proposed methodology is capable of estimating the six degree-of-freedom (DOF) relative velocity without any a priori information about the body of interest. Further, the raw Doppler velocity measurement field directly exposes the body of interest's center of rotation (i.e., center of mass), enabling precise 6-DOF pose estimation when the rate estimates are fused within a Kalman filter architecture (the rigid-body velocity-field relation underlying such rate estimation is sketched after this list). These innovative techniques are computationally inexpensive and do not require information from peripheral sensors (e.g., gyroscope, magnetometer, accelerometer). The efficacy of the proposed algorithms was evaluated via emulation robotics experiments at the Land, Air, and Space Robotics (LASR) laboratory at Texas A&M University. Although testing was completed with a single body of interest, the approach can be used to estimate the 6-DOF relative velocity of any number of non-cooperative bodies within the field of view online.
  5. Vehicle to Vehicle (V2V) communication allows vehicles to wirelessly exchange information about the surrounding environment and enables cooperative perception. It helps prevent accidents, increases passenger safety, and improves traffic flow efficiency. However, these benefits can only be realized when vehicles can communicate with each other quickly and reliably. We therefore investigated two ways to improve the communication quality of V2V: first, using beamforming to increase the bandwidth of V2V communication by establishing accurate and stable collaborative beam connections between vehicles on the road; second, ensuring scalable transmission by decreasing the amount of data to be transmitted, thus reducing the bandwidth required for the collaborative perception of autonomous driving vehicles. Beamforming in V2V communication can be achieved by utilizing image-based and LIDAR 3D-data-based vehicle detection and tracking. For the vehicle detection and tracking simulation, we tested the Single Shot Multibox Detector, a deep learning-based object detection method that achieves a mean Average Precision of 0.837, together with a Kalman filter for tracking. For scalable transmission, we simulated the effect of varying pixel resolutions as well as different image compression techniques on the file size of the data. Results show that, without compression, the file size for transmitting only the bounding boxes containing detected objects is up to 10 times smaller than the original file size (a back-of-the-envelope illustration appears after this list). Similar results are also observed when the file is compressed by lossless and lossy compression to varying degrees. Based on these findings using existing databases, the impact of these compression methods and of methods for effectively combining feature maps on the performance of object detection and tracking models will be further tested in a real-world autonomous driving system.
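Regarding item 2 above (the MEKF with a velocimeter LIDAR): the sketch below illustrates only the multiplicative attitude-correction (reset) step that gives the MEKF its name, not the paper's full filter. The error state dtheta, the quaternion convention (Hamilton, scalar-first, body-side correction), and the example values are assumptions for illustration.

```python
# Illustrative sketch (not the paper's filter): the multiplicative attitude
# reset of an MEKF. The filter estimates a small-angle error vector dtheta;
# the reference quaternion is corrected by quaternion multiplication rather
# than by vector addition, then the error state is zeroed.
import numpy as np

def quat_mul(q, p):
    """Hamilton product of quaternions in [w, x, y, z] order."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = p
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def small_angle_quat(dtheta):
    """Error quaternion from a small rotation vector (radians)."""
    angle = np.linalg.norm(dtheta)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = dtheta / angle
    return np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])

def mekf_reset(q_ref, dtheta):
    """Fold the estimated error back into the reference attitude (body-side convention)."""
    q_new = quat_mul(q_ref, small_angle_quat(dtheta))
    return q_new / np.linalg.norm(q_new)     # re-normalize; error state resets to 0

# Example: apply a 0.5-degree correction about the body z-axis.
q = np.array([1.0, 0.0, 0.0, 0.0])
q = mekf_reset(q, np.array([0.0, 0.0, np.deg2rad(0.5)]))
```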
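Regarding item 3 above (the MCC-EKF for fusing LiDAR and visual odometry): the snippet below is a simplified illustration of the underlying idea, down-weighting a measurement whose innovation is large using a Gaussian (correntropy) kernel, not the exact MCC-EKF equations from the paper. The state, measurement model, kernel bandwidth, and noise values are assumptions.

```python
# Simplified illustration (not the exact MCC-EKF): a Kalman-style update in
# which the measurement noise is inflated by the inverse of a Gaussian kernel
# of the innovation, so a shot-noise spike or injected attack measurement
# contributes very little to the fused odometry estimate.
import numpy as np

def correntropy_weight(innovation, S, sigma=2.0):
    """Gaussian kernel of the Mahalanobis innovation norm; sigma is the kernel bandwidth."""
    d2 = innovation @ np.linalg.solve(S, innovation)
    return np.exp(-d2 / (2.0 * sigma**2))

def robust_update(x, P, z, H, R):
    """Standard KF update with R inflated by 1/weight for large innovations."""
    innov = z - H @ x
    S = H @ P @ H.T + R
    w = max(correntropy_weight(innov, S), 1e-6)    # avoid division by zero
    R_eff = R / w                                  # outlier -> huge effective noise
    S_eff = H @ P @ H.T + R_eff
    K = P @ H.T @ np.linalg.inv(S_eff)
    return x + K @ innov, (np.eye(len(x)) - K @ H) @ P

# Fusing two odometry sources for a planar position [x, y] (assumed setup):
x, P = np.zeros(2), np.eye(2)
H, R = np.eye(2), 0.1 * np.eye(2)
x, P = robust_update(x, P, np.array([1.0, 0.9]), H, R)     # LiDAR odometry
x, P = robust_update(x, P, np.array([25.0, -30.0]), H, R)  # corrupted visual odometry, nearly ignored
```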
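Regarding item 4 above (6-DOF relative velocity of a non-cooperative body): the sketch below shows only the rigid-body velocity-field relation v_i = v_ref + omega x r_i that such rate estimation builds on, solved by least squares. It assumes, hypothetically, that a full 3D velocity vector per surface point is available (e.g., from differencing registered scans); the velocimeter LIDAR itself measures only the line-of-sight Doppler component per point, so this is not the paper's estimator.

```python
# Illustrative sketch: least-squares fit of translational and angular rates
# to a sampled rigid-body velocity field v_i = v_ref + omega x r_i.
import numpy as np

def skew(r):
    """Cross-product matrix so that skew(r) @ w == np.cross(r, w)."""
    return np.array([[0, -r[2], r[1]],
                     [r[2], 0, -r[0]],
                     [-r[1], r[0], 0]])

def estimate_rates(points, velocities):
    """Solve [v_ref (3), omega (3)] from v_i = v_ref + omega x r_i in a least-squares sense."""
    A = np.vstack([np.hstack([np.eye(3), -skew(r)]) for r in points])
    b = np.concatenate(velocities)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol[:3], sol[3:]                  # translational rate, angular rate

# Synthetic check: a body translating at [0.1, 0, 0] m/s and spinning at 0.2 rad/s about z.
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(50, 3))
v_true, w_true = np.array([0.1, 0.0, 0.0]), np.array([0.0, 0.0, 0.2])
vels = [v_true + np.cross(w_true, p) for p in pts]
v_est, w_est = estimate_rates(pts, vels)
```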
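Regarding item 5 above (scalable transmission for V2V cooperative perception): the back-of-the-envelope sketch below compares the raw bytes of a full camera frame against transmitting only the pixel crops inside detected-object bounding boxes. The frame size and box dimensions are assumed values, not the paper's data; it simply illustrates why box-only transmission can cut the payload by an order of magnitude before any compression.

```python
# Back-of-the-envelope sketch with assumed sizes (not the paper's dataset).
H, W, C = 720, 1280, 3                        # assumed uncompressed RGB frame
full_frame_bytes = H * W * C

# Hypothetical detections: (x, y, width, height) boxes from an SSD-style detector.
boxes = [(100, 300, 180, 120), (600, 320, 220, 140), (950, 310, 160, 110)]
crop_bytes = sum(w * h * C for (_, _, w, h) in boxes)

print(f"full frame : {full_frame_bytes / 1e6:.2f} MB")
print(f"box crops  : {crop_bytes / 1e6:.2f} MB "
      f"({full_frame_bytes / crop_bytes:.1f}x smaller, before any compression)")
```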