

Title: Machine-Learning-Based Real-Time Multi-Camera Vehicle Tracking and Travel-Time Estimation
Travel-time estimation of traffic flow is an important problem with critical implications for traffic congestion analysis. We developed techniques for using intersection videos to identify vehicle trajectories across multiple cameras and analyze corridor travel time. Our approach consists of (1) multi-object single-camera tracking, (2) vehicle re-identification among different cameras, (3) multi-object multi-camera tracking, and (4) travel-time estimation. We evaluated the proposed framework on real intersections in Florida with pan and fisheye cameras. The experimental results demonstrate the viability and effectiveness of our method.
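A minimal Python sketch of how the final stage might turn cross-camera matches into travel times, assuming stages (1)-(3) have already assigned each vehicle a global ID; the Track structure, camera names, and timestamps below are illustrative, not the paper's actual data model.

```python
# Hypothetical sketch of stage (4): estimating corridor travel time from
# trajectories already matched across cameras by re-identification.
from dataclasses import dataclass
from statistics import median

@dataclass
class Track:
    vehicle_id: int      # global ID assigned by cross-camera re-identification
    camera: str          # camera name, e.g. "cam_A" (upstream), "cam_B" (downstream)
    timestamps: list     # timestamps (seconds) at which the vehicle was seen

def corridor_travel_times(tracks, upstream="cam_A", downstream="cam_B"):
    """Travel time per vehicle = first sighting downstream - last sighting upstream."""
    last_seen_up, first_seen_down = {}, {}
    for t in tracks:
        if t.camera == upstream:
            last_seen_up[t.vehicle_id] = max(t.timestamps)
        elif t.camera == downstream:
            first_seen_down[t.vehicle_id] = min(t.timestamps)
    return {vid: first_seen_down[vid] - last_seen_up[vid]
            for vid in last_seen_up.keys() & first_seen_down.keys()
            if first_seen_down[vid] > last_seen_up[vid]}

tracks = [Track(7, "cam_A", [3.0, 3.5, 4.0]), Track(7, "cam_B", [41.0, 41.5])]
times = corridor_travel_times(tracks)
print(times)                    # {7: 37.0}
print(median(times.values()))   # corridor-level travel-time estimate
```

Per-vehicle times can then be aggregated, for example via the median as above, into a corridor-level estimate.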
Award ID(s):
1922782
NSF-PAR ID:
10332841
Author(s) / Creator(s):
Date Published:
Journal Name:
Journal of Imaging
Volume:
8
Issue:
4
ISSN:
2313-433X
Page Range / eLocation ID:
101
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Traffic congestion hits most big cities in the world, threatening long delays and serious reductions in air quality. City and local government officials continue to face challenges in optimizing crowd flow, synchronizing traffic, and mitigating threats or dangerous situations. One of the major challenges faced by city planners and traffic engineers is developing a robust traffic controller that eliminates traffic congestion and imbalanced traffic flow at intersections. Ensuring that traffic moves smoothly and minimizing waiting time at intersections requires automated vehicle detection techniques for controlling traffic lights, which remain challenging problems. In this paper, we propose an intelligent traffic pattern collection and analysis model, named TPCAM, based on traffic cameras, to help keep vehicles moving smoothly through junctions and to reduce traffic congestion. Our traffic detection and pattern analysis model aims at detecting and calculating the traffic flux of vehicles and pedestrians at intersections in real time. Our system can utilize one camera to capture all the traffic flows at one intersection instead of multiple cameras, which reduces infrastructure requirements and eases deployment. We propose a new deep learning model based on YOLOv2 and adapt it to traffic detection scenarios. To reduce network burdens and avoid deploying a network backbone at the intersections, we process the traffic video data at the network edge without transmitting the big data back to the cloud. To improve the processing frame rate at the edge, we further propose a deep object tracking algorithm leveraging adaptive multi-modal models that is robust to object occlusions and varying lighting conditions. Based on this deep-learning-based detection and tracking, we achieve pseudo-30 FPS via adaptive key frame selection, as sketched below.
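A hypothetical sketch of the adaptive key-frame idea behind the pseudo-30-FPS figure: an expensive detector runs only on key frames, a cheap tracker propagates results in between, and the key-frame interval widens while the scene is stable. The interval-adaptation rule and the toy detector/tracker stand-ins are our own assumptions, not TPCAM's implementation.

```python
def process_stream(frames, detect, track, min_gap=1, max_gap=8):
    """Run the heavy detector only on key frames; track cheaply in between."""
    boxes, gap, since_key = [], min_gap, min_gap
    for frame in frames:
        if since_key >= gap:                  # key frame: run the detector
            detected = detect(frame)
            # If tracking kept up since the last key frame, widen the interval;
            # otherwise fall back to detecting as often as possible.
            gap = min(gap + 1, max_gap) if len(detected) == len(boxes) else min_gap
            boxes, since_key = detected, 0
        else:                                 # intermediate frame: track only
            boxes = track(frame, boxes)
            since_key += 1
        yield boxes

# Toy stand-ins so the sketch runs without a real detector or tracker.
frames = range(20)
detect = lambda f: ["car"] * (2 if f < 10 else 3)   # a third car appears at f=10
track = lambda f, boxes: boxes                      # pretend tracking is perfect
for i, b in enumerate(process_stream(frames, detect, track)):
    print(i, len(b))
```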
  2. Camera-based systems are increasingly used for collecting information on intersections and arterials. Unlike loop detectors, which can generally only be used to detect the presence and movement of vehicles, cameras can provide rich information about traffic behavior. Vision-based frameworks for multiple-object detection, object tracking, and near-miss detection have been developed to derive this information. However, much of this work addresses processing videos offline. In this article, we propose an integrated two-stream convolutional network architecture that performs real-time detection, tracking, and near-accident detection of road users in traffic video data. The two-stream model consists of a spatial stream network for object detection and a temporal stream network that leverages motion features for multiple-object tracking. We detect near-accidents by incorporating appearance features and motion features from these two networks, as illustrated in the sketch below. Further, we demonstrate that our approaches can be executed in real time, at a frame rate higher than the video frame rate, on a variety of videos collected from fisheye and overhead cameras.
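As one way to picture the fusion step, the sketch below flags a pair of tracked objects as a near-accident candidate when they are close and converging, using object positions (as the spatial stream would supply) and motion vectors (as the temporal stream would supply). All names and thresholds are illustrative assumptions, not the paper's model.

```python
# Hedged sketch: flag object pairs that are near each other or on a short
# time-to-contact, given positions (px) and velocities (px/s) per object.
import math
from itertools import combinations

def near_accident_pairs(positions, velocities, dist_thresh=30.0, ttc_thresh=2.0):
    """positions/velocities: dicts of object id -> (x, y)."""
    flagged = []
    for a, b in combinations(positions, 2):
        dx = positions[b][0] - positions[a][0]
        dy = positions[b][1] - positions[a][1]
        dist = math.hypot(dx, dy)
        # Relative velocity projected onto the line joining the two objects:
        # a negative closing speed means the gap is shrinking.
        rvx = velocities[b][0] - velocities[a][0]
        rvy = velocities[b][1] - velocities[a][1]
        closing = (dx * rvx + dy * rvy) / max(dist, 1e-6)
        if closing < 0:
            ttc = dist / -closing          # crude time-to-contact estimate
            if dist < dist_thresh or ttc < ttc_thresh:
                flagged.append((a, b, round(ttc, 2)))
    return flagged

pos = {"car1": (100.0, 50.0), "car2": (140.0, 50.0)}
vel = {"car1": (25.0, 0.0), "car2": (-10.0, 0.0)}
print(near_accident_pairs(pos, vel))   # [('car1', 'car2', 1.14)]
```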
  3. The operational safety of Automated Driving System (ADS)-Operated Vehicles (AVs) is a rising concern as AV prototypes are tested and commercially deployed. The robustness of safety evaluation systems is essential in determining the operational safety of AVs as they interact with human-driven vehicles. Extending earlier work of the Institute of Automated Mobility (IAM) on Operational Safety Assessment (OSA) metrics and infrastructure-based safety monitoring systems, in this work we compare the performance of an infrastructure-based Light Detection and Ranging (LIDAR) system to an onboard vehicle-based LIDAR system in testing at the Maricopa County Department of Transportation SMARTDrive testbed in Anthem, Arizona. The sensor modalities, located in the infrastructure and onboard the test vehicles, include LIDAR, cameras, a real-time differential GPS, and a drone with a camera. Bespoke localization and tracking algorithms were created for the LIDAR and cameras. In total, there are 26 different scenarios of the test vehicles navigating the testbed intersection; for this work, we consider only car-following scenarios. The LIDAR data collected from the infrastructure-based and onboard vehicle-based sensor systems are used to perform object detection and multi-target tracking, to estimate the velocity and position of the test vehicles, and to compute OSA metrics from these estimates. The comparison of the two systems considers the localization and tracking errors in the estimated position and velocity of the subject vehicle, with the real-time differential GPS data serving as ground truth for velocity comparison and the tracking results from the drone serving as ground truth for OSA metrics comparison. A sketch of this comparison step follows.
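The comparison step might look like the following sketch: tracker velocity estimates are scored against the differential-GPS ground truth, and a simple car-following OSA-style metric (time headway) is computed from the tracked states. Field names and values are illustrative assumptions, not the study's data.

```python
# Hypothetical comparison sketch: RMSE of LIDAR-tracked speed vs. GPS truth,
# plus time headway as a simple car-following OSA-style metric.
import math

def velocity_rmse(estimated, ground_truth):
    """Both arguments: lists of speeds (m/s) sampled at the same timestamps."""
    assert len(estimated) == len(ground_truth)
    return math.sqrt(sum((e - g) ** 2 for e, g in zip(estimated, ground_truth))
                     / len(estimated))

def time_headway(gap_m, follower_speed_mps):
    """Seconds for the follower to cover the current gap to the lead vehicle."""
    return float("inf") if follower_speed_mps <= 0 else gap_m / follower_speed_mps

lidar_speed = [12.1, 12.4, 12.8, 13.0]   # tracker output for the follower
gps_speed   = [12.0, 12.5, 12.7, 13.2]   # differential-GPS ground truth
print(round(velocity_rmse(lidar_speed, gps_speed), 3))              # 0.132
print(round(time_headway(gap_m=18.0, follower_speed_mps=12.8), 2))  # 1.41
```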
  4. In Video Analytics Pipelines (VAP), Analytics Units (AUs) such as object detection and face recognition running on remote servers critically rely on surveillance cameras to capture high-quality video streams in order to achieve high accuracy. Modern IP cameras come with a large number of camera parameters that directly affect the quality of the captured video stream. While a few such parameters, e.g., exposure, focus, and white balance, are automatically adjusted by the camera internally, the remaining ones are not. We denote such camera parameters as non-automated (NAUTO) parameters. In this paper, we first show that changes in environmental conditions can have a significant adverse effect on the accuracy of insights from the AUs, but such adverse impact can potentially be mitigated by dynamically adjusting NAUTO camera parameters in response to those changes. We then present CamTuner, to our knowledge the first framework that dynamically adapts NAUTO camera parameters to optimize the accuracy of AUs in a VAP in response to adverse changes in environmental conditions. CamTuner is based on SARSA reinforcement learning (sketched below) and incorporates two novel components, a lightweight analytics quality estimator and a virtual camera, that drastically speed up offline RL training. Our controlled experiments and real-world VAP deployment show that, compared to a VAP using the default camera settings, CamTuner enhances VAP accuracy by detecting 15.9% additional persons and 2.6%--4.2% additional cars (without any false positives) in a large enterprise parking lot, and 9.7% additional cars in a 5G smart traffic intersection scenario, which enables a new use case of accurate and reliable automatic vehicle collision prediction (AVCP). CamTuner opens doors for new ways to significantly enhance video analytics accuracy beyond incremental improvements from refining deep-learning models.
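A minimal tabular SARSA sketch of the underlying idea: the state and action spaces here (a single camera parameter nudged up or down, with analytics quality as reward) are stand-ins for CamTuner's actual state, action, and reward design, which the abstract does not detail.

```python
# Toy SARSA loop: learn which direction to nudge one NAUTO-style parameter.
import random
from collections import defaultdict

ACTIONS = [-1, 0, 1]          # nudge the parameter down, hold, or up
Q = defaultdict(float)        # tabular action values: Q[(state, action)]

def choose(state, eps=0.1):
    """Epsilon-greedy action selection over the current Q estimates."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def sarsa_step(s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    # On-policy update: bootstrap from the action actually taken next.
    Q[(s, a)] += alpha * (reward + gamma * Q[(s_next, a_next)] - Q[(s, a)])

# Toy environment: the state is the current parameter value (0..10) and the
# analytics quality (reward) peaks at a pretend optimum of 7.
param = 5
action = choose(param)
for _ in range(2000):
    s = param
    param = min(10, max(0, param + action))
    reward = -abs(param - 7)          # stand-in for the quality estimator
    next_action = choose(param)
    sarsa_step(s, action, reward, param, next_action)
    action = next_action

# After training, the greedy policy should mostly push the parameter toward 7.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(11)})
```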
  5. Counting multi-vehicle motions via traffic cameras in urban areas is crucial for smart cities. Even though several frameworks have been proposed for this task, no prior work focuses on highly common, dense, and size-variant vehicles such as motorcycles. In this paper, we propose a novel framework for vehicle motion counting with adaptive, label-independent tracking and counting modules that processes 12 frames per second. Our framework adapts its hyperparameters for multi-vehicle tracking, works reliably in complex traffic conditions, and is largely invariant to camera perspective. We achieve competitive results in terms of root-mean-square error and runtime performance. A counting sketch follows.
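An illustrative counting sketch (not the paper's method): a motion is registered when a track's centroid crosses a virtual counting line, so each vehicle is counted once despite per-frame detection jitter. The line position and track format are assumptions.

```python
def count_line_crossings(tracks, line_y=240):
    """tracks: dict of track id -> list of (x, y) centroids over time."""
    count = 0
    for points in tracks.values():
        above = [y < line_y for _, y in points]
        # Count one crossing if the track starts above the line and ends below.
        if above and above[0] and not above[-1]:
            count += 1
    return count

tracks = {
    1: [(50, 100), (52, 180), (55, 300)],    # crosses y=240: counted
    2: [(300, 200), (301, 210), (299, 205)], # stays above the line: ignored
}
print(count_line_crossings(tracks))   # 1
```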