Deep learning models have significantly improved object detection, which is essential for visual sensing. However, their increasing complexity results in higher latency and resource consumption, making real-time object detection challenging. To address this challenge, we propose a new lightweight filtering method, called L-filter, that predicts empty video frames (frames containing no object of interest, e.g., vehicles) with high accuracy via hybrid time series analysis. L-filter drops frames deemed empty and runs object detection on nonempty frames only, significantly enhancing the frame processing rate and scalability of real-time object detection. Our evaluation demonstrates that L-filter improves the frame processing rate by 31–47% for a single traffic video stream compared to three standalone state-of-the-art object detection models without L-filter. Additionally, L-filter significantly enhances scalability: paired with the fastest of the three detection models, it processes up to six concurrent video streams on one commodity GPU, sustaining over 57 fps per stream.
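The record itself includes no code, but the filter-then-detect pipeline is easy to picture. Below is a minimal Python sketch of that loop, assuming a hypothetical `predict_empty` callable standing in for L-filter's hybrid time-series model and a generic `detector` standing in for any third-party object detection model.

```python
# Minimal sketch of an L-filter-style filter-then-detect pipeline.
# `predict_empty` and `detector` are assumed callables, not the paper's code.
import cv2

def filtered_detection(video_path, predict_empty, detector):
    """Run the detector only on frames the lightweight filter deems nonempty."""
    cap = cv2.VideoCapture(video_path)
    results = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if predict_empty(frame):          # cheap check: skip frames with no objects of interest
            continue
        results.append(detector(frame))   # full object detection on nonempty frames only
    cap.release()
    return results
```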
Preprocessing via Deep Learning for Enhancing Real-Time Performance of Object Detection
Deep learning models have significantly improved object detection, which is essential for traffic monitoring. However, these models' increasing complexity results in higher latency and resource consumption, making real-time object detection challenging. To address this issue, we propose a lightweight deep learning model called Empty Road Detection (ERD). ERD efficiently identifies and removes empty traffic images, i.e., those that contain no object of interest such as vehicles, via binary classification. Serving as a preprocessing unit, ERD filters out nonessential data, reducing computational complexity and latency. ERD is highly compatible and works seamlessly with any third-party object detection model. In our evaluation, ERD improves the frame processing rate of EfficientDet, SSD, and YOLOv5 by approximately 44%, 40%, and 10%, respectively, on a real-world traffic monitoring video.
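As a rough illustration of what an ERD-style binary preprocessing classifier could look like, here is a small PyTorch sketch; the layer sizes, input resolution, and 0.5 decision threshold are illustrative assumptions, not the paper's architecture.

```python
# A minimal sketch of an "empty road" binary classifier in the spirit of ERD.
# The tiny CNN below is an assumption; the paper does not specify its layers.
import torch
import torch.nn as nn

class EmptyRoadDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 1)   # single logit: empty vs. nonempty

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

# Usage: route a frame to the full detector only when ERD predicts "nonempty".
model = EmptyRoadDetector().eval()
frame = torch.rand(1, 3, 224, 224)           # placeholder for a preprocessed video frame
with torch.no_grad():
    is_empty = torch.sigmoid(model(frame)).item() > 0.5
```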
- Award ID(s): 2007854
- PAR ID: 10435826
- Date Published:
- Journal Name: IEEE 97th Vehicular Technology Conference (VTC-2023 Spring)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Real-time object detection is essential for AI-based intelligent traffic management. However, the growing complexity of deep learning models for object detection causes increased latency and resource requirements. To tackle this challenge, we introduce a new approach, named AROD (Adaptive Real-Time Object Detection), that infers the pixel motion speed in continuous traffic video frames and skips redundant frames when the pixel velocity is low. AROD thereby aims to significantly enhance efficiency and scalability while sustaining the accuracy of object detection. Our evaluation using real-world traffic videos reveals that our lightweight deep learning method for pixel velocity inference reduces the RMSE (Root Mean Square Error) by up to two orders of magnitude compared to state-of-the-art approaches. AROD improves the frame processing rate of YOLOv5, SSD, and EfficientDet by approximately 32–61%, 110–174%, and 120–213%, respectively. AROD considerably enhances scalability by supporting real-time object detection for up to three concurrent traffic video streams on a commodity machine. Moreover, AROD demonstrates its generalizability by achieving competitive object detection accuracy on a separate traffic video that was fully withheld during training.
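AROD's velocity model is a lightweight deep network; as a hedged stand-in, the sketch below uses a plain mean frame difference as the motion score, with an assumed threshold deciding when a frame is redundant enough to skip.

```python
# Sketch of AROD-style adaptive frame skipping. A mean frame difference is a
# stand-in for the paper's learned pixel-velocity model; the threshold is an
# illustrative assumption, not a value from the paper.
import cv2
import numpy as np

MOTION_THRESHOLD = 4.0  # assumed tuning parameter

def adaptive_detect(video_path, detector):
    cap = cv2.VideoCapture(video_path)
    prev_gray, outputs = None, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is not None:
            motion = np.mean(cv2.absdiff(gray, prev_gray))
            if motion < MOTION_THRESHOLD:   # low pixel velocity: skip redundant frame
                prev_gray = gray
                continue
        outputs.append(detector(frame))     # full detection on frames with motion
        prev_gray = gray
    cap.release()
    return outputs
```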
-
In this paper, an urban object detection system via unmanned aerial vehicles (UAVs) is developed to collect real-time traffic information, which can be further utilized in many applications such as traffic monitoring and urban traffic management. The system includes an object detection algorithm, deep learning model training, and deployment on a real UAV. For the object detection algorithm, the Mobilenet-SSD model is applied owing to its light weight and efficiency, which make it suitable for real-time applications on an onboard microprocessor. For model training, federated learning (FL) is used to protect privacy and increase efficiency via parallel computing. Finally, the FL-trained object detection model is deployed on a real UAV for real-time performance testing. The experimental results show that the object detection algorithm reaches a speed of 18 frames per second with good detection performance, which demonstrates the real-time computation ability of a resource-limited edge device and validates the effectiveness of the developed system.
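The abstract mentions federated learning for privacy-preserving, parallel training; a minimal federated-averaging (FedAvg-style) sketch is shown below. Weighting by client dataset size is a simplifying assumption, since the paper does not detail its aggregation rule.

```python
# A minimal FedAvg-style aggregation sketch: average client model weights,
# weighted by each client's dataset size (a common but here assumed choice).
import torch

def federated_average(client_state_dicts, client_sizes):
    """Combine client state dicts into one averaged global state dict."""
    total = sum(client_sizes)
    avg = {}
    for key in client_state_dicts[0]:
        avg[key] = sum(
            sd[key].float() * (n / total)
            for sd, n in zip(client_state_dicts, client_sizes)
        )
    return avg
```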
-
Traffic intersections are prime locations for deploying infrastructure sensors and edge computing nodes to realize the vision of a smart city. It is expected that the needs of a smart city, with regard to traffic and pedestrian systems monitored by cameras/video, can be met by using state-of-the-art artificial intelligence (AI) based object detectors and trackers. A critical component in designing an effective real-time object detection/tracking pipeline is understanding how object density, i.e., the number of objects in a scene, image resolution, and frame rate influence the performance metrics. This study explores the accuracy and speed metrics with the goal of supporting pipelines that meet the precision and latency needs of a real-time environment. We examine the impact of varying image resolution, frame rate, and object density on object detection performance metrics. The experiments on the COSMOS testbed dataset show that increasing the frame width from 416 pixels to 832 pixels, and cropping the images to a square resolution, increase the average precision for all object classes. Decreasing the frame rate from 15 fps to 5 fps preserves more than 90% of the highest F1 score achieved for all object classes. The results inform the choice of video preprocessing stages and modifications to established AI-based object detection/tracking methods, and suggest optimal hyperparameter values. Index Terms: Object Detection, Smart City, Video Resolution, Deep Learning Models.
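To make the evaluated preprocessing choices concrete, here is a short sketch of square cropping, resizing to a target frame width, and frame-rate downsampling; the center-crop strategy and uniform frame sampling are assumptions, since the study does not specify them.

```python
# Sketch of the preprocessing knobs the study evaluates: square crop,
# resize to a target width (416 or 832 px), and 15 fps -> 5 fps sampling.
import cv2

def preprocess(frame, target_width=832):
    h, w = frame.shape[:2]
    side = min(h, w)                        # crop to a square resolution
    y0, x0 = (h - side) // 2, (w - side) // 2
    square = frame[y0:y0 + side, x0:x0 + side]
    return cv2.resize(square, (target_width, target_width))

def sample_frames(frames, src_fps=15, dst_fps=5):
    step = src_fps // dst_fps               # keep every third frame for 15 -> 5 fps
    return frames[::step]
```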
-
Traffic congestion hits most big cities in the world, threatening long delays and serious reductions in air quality. City and local government officials continue to face challenges in optimizing crowd flow, synchronizing traffic, and mitigating threats or dangerous situations. One of the major challenges faced by city planners and traffic engineers is developing a robust traffic controller that eliminates traffic congestion and imbalanced traffic flow at intersections. Ensuring that traffic moves smoothly and minimizing waiting times at intersections require automated vehicle detection techniques for controlling traffic lights, which remain challenging problems. In this paper, we propose an intelligent traffic pattern collection and analysis model, named TPCAM, based on traffic cameras to help smooth vehicular movement at junctions and reduce traffic congestion. Our traffic detection and pattern analysis model aims at detecting and calculating the traffic flux of vehicles and pedestrians at intersections in real time. Our system can utilize a single camera to capture all the traffic flows at one intersection instead of multiple cameras, which reduces the infrastructure requirements and makes deployment easy. We propose a new deep learning model based on YOLOv2 and adapt it to traffic detection scenarios. To reduce the network burden and eliminate the deployment of a network backbone at the intersections, we propose to process the traffic video data at the network edge without transmitting the big data back to the cloud. To improve the processing frame rate at the edge, we further propose a deep object tracking algorithm leveraging adaptive multi-modal models that is robust to object occlusions and varying lighting conditions. Based on the deep-learning-based detection and tracking, we can achieve pseudo-30FPS via adaptive key frame selection.
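The detect-on-key-frames, track-in-between structure behind TPCAM's pseudo-30FPS claim can be sketched as follows; the fixed key-frame interval and OpenCV's KCF tracker are illustrative stand-ins for the paper's adaptive key frame selection and multi-modal tracking models.

```python
# Sketch of detection plus tracking with key-frame selection. A fixed interval
# and the KCF tracker (opencv-contrib) are assumptions standing in for the
# paper's adaptive, multi-modal design.
import cv2

KEY_FRAME_INTERVAL = 6  # assumed: detect every 6th frame, track in between

def detect_and_track(video_path, detector):
    cap = cv2.VideoCapture(video_path)
    trackers, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % KEY_FRAME_INTERVAL == 0:        # key frame: run full detection
            trackers = []
            for box in detector(frame):          # box = (x, y, w, h)
                t = cv2.TrackerKCF_create()
                t.init(frame, box)
                trackers.append(t)
        else:                                    # non-key frame: cheap tracker update
            for t in trackers:
                t.update(frame)
        idx += 1
    cap.release()
```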