Title: A Modern Intersection Data Analytics System for Pedestrian and Vehicular Safety
As a part of road safety initiatives, surrogate road safety approaches have gained popularity due to the rapid advancement of video collection and processing technologies. This paper presents an end-to-end software pipeline for processing traffic videos and running a safety analysis based on surrogate safety measures. We developed algorithms and software to determine trajectory movements and phases that, when combined with signal timing data, enable accurate event detection and categorization by conflict type for both pedestrian-vehicle and vehicle-vehicle interactions. Using this information, we introduce a new surrogate safety measure, the “severe event,” quantified by multiple existing metrics recorded for the event, such as time-to-collision (TTC), post-encroachment time (PET), deceleration, and speed. We present an efficient multistage event filtering approach followed by a multi-attribute decision tree algorithm that prunes the extensive set of conflicting interactions down to a robust set of severe events. The pipeline was used to process traffic videos from several intersections in multiple cities to measure and compare pedestrian and vehicle safety. Detailed experimental results are presented to demonstrate its effectiveness.
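To make the “severe event” notion concrete, below is a minimal sketch of how PET and a constant-velocity TTC estimate could be computed and combined with speed and deceleration cutoffs. The helper functions and all threshold values are illustrative assumptions, not the paper's calibrated decision-tree parameters.

```python
# Minimal sketch (not the paper's implementation): compute PET and a simple
# constant-velocity TTC estimate, then apply a multi-attribute rule to flag
# a "severe event". All threshold values are illustrative assumptions.

def post_encroachment_time(first_exit_t: float, second_entry_t: float) -> float:
    """PET: time gap between the first road user leaving the conflict zone
    and the second road user entering it (seconds)."""
    return second_entry_t - first_exit_t

def time_to_collision(gap_m: float, closing_speed_mps: float) -> float:
    """Constant-velocity TTC estimate along the approach line (seconds)."""
    return gap_m / closing_speed_mps if closing_speed_mps > 1e-6 else float("inf")

def is_severe_event(pet_s: float, ttc_s: float, approach_speed_mps: float,
                    decel_mps2: float) -> bool:
    """Multi-attribute rule combining PET, TTC, speed, and deceleration.
    The cutoffs below are placeholders, not the paper's calibrated values."""
    close_call = pet_s < 1.5 or ttc_s < 2.0
    high_energy = approach_speed_mps > 8.0 or decel_mps2 > 3.0
    return close_call and high_energy

# Example: a vehicle clears the crosswalk 0.9 s before a pedestrian enters it.
pet = post_encroachment_time(first_exit_t=12.3, second_entry_t=13.2)
ttc = time_to_collision(gap_m=14.0, closing_speed_mps=9.5)
print(pet, ttc, is_severe_event(pet, ttc, approach_speed_mps=9.5, decel_mps2=1.2))
```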
Award ID(s):
1922782
NSF-PAR ID:
10417182
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of 2022 IEEE International Intelligent Transportation Systems Conference (ITSC)
Page Range / eLocation ID:
3117 to 3124
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Skateboarding as a method of transportation has become prevalent, which has increased the occurrence and likelihood of pedestrian–skateboarder collisions and near-collision scenarios in shared-use roadway areas. Collisions between pedestrians and skateboarders can result in significant injury. New approaches are needed to evaluate shared-use areas prone to hazardous pedestrian–skateboarder interactions, and to perform real-time, in situ (e.g., on-device) predictions of pedestrian–skateboarder collisions as road conditions vary due to changes in land usage and construction. A mechanism called Surrogate Safety Measures for skateboarder–pedestrian interaction can be computed to evaluate high-risk conditions on roads and sidewalks using deep learning object detection models. In this paper, we present the first skateboarder–pedestrian safety study leveraging deep learning architectures. We review and analyze state-of-the-art deep learning architectures, namely Faster R-CNN and two variants of the Single Shot MultiBox Detector (SSD), to select the model that best suits each of two different tasks: automated calculation of Post Encroachment Time (PET) and finding hazardous conflict zones in real time. We also contribute a new annotated dataset of skateboarder–pedestrian interactions collected for this study. Both selected models can detect and classify pedestrians and skateboarders correctly and efficiently. However, because of differences in their architectures and the advantages and disadvantages of each, the two models were used for different tasks: the Faster R-CNN model, due to its higher accuracy, was used to automate the calculation of post encroachment time, whereas the Single Shot MultiBox MobileNet V1 model, due to its extremely fast inference rate, was used to determine hazardous regions in real time. An outcome of this work is a model that can be deployed on low-cost, small-footprint mobile and IoT devices at traffic intersections with existing cameras to perform on-device inferencing for in situ Surrogate Safety Measures (SSM), such as Time-To-Collision (TTC) and Post Encroachment Time (PET). SSM values that exceed a hazard threshold can be published to a Message Queuing Telemetry Transport (MQTT) broker, where messages are received by an intersection traffic signal controller for real-time signal adjustment, thus contributing to state-of-the-art vehicle and pedestrian safety at hazard-prone intersections.
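As a rough illustration of the publishing step described above, the sketch below sends a hazardous PET reading to an MQTT broker with the paho-mqtt client. The broker address, topic name, threshold value, and the assumption that a smaller PET indicates higher risk are all illustrative choices, not the study's actual configuration.

```python
# Illustrative sketch (assumed broker/topic/threshold, not the study's setup):
# publish a hazardous PET measurement to an MQTT broker so that a traffic
# signal controller subscribed to the topic can react in real time.
import json
import paho.mqtt.publish as publish

PET_HAZARD_THRESHOLD_S = 1.5        # assumed hazard threshold (seconds)
BROKER_HOST = "broker.example.org"  # placeholder broker address
TOPIC = "intersection/42/ssm"       # placeholder topic name

def publish_if_hazardous(pet_seconds: float, camera_id: str) -> None:
    """Send the SSM reading only when it crosses the hazard threshold
    (assumed here: smaller PET means a more hazardous interaction)."""
    if pet_seconds >= PET_HAZARD_THRESHOLD_S:
        return  # interaction not considered hazardous
    payload = json.dumps({"camera": camera_id, "ssm": "PET", "value_s": pet_seconds})
    publish.single(TOPIC, payload, hostname=BROKER_HOST, qos=1)

publish_if_hazardous(pet_seconds=0.8, camera_id="cam-03")
```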
  2. Learning the human–mobility interaction (HMI) in interactive scenes (e.g., how a vehicle turns at an intersection in response to traffic lights and other oncoming vehicles) can enhance the safety, efficiency, and resilience of smart mobility systems (e.g., autonomous vehicles) and many other ubiquitous computing applications. Toward ubiquitous and understandable HMI learning, this paper considers both spoken language (e.g., human textual annotations) and unspoken language (e.g., visual and sensor-based behavioral mobility information related to the HMI scenes) as information modalities from real-world HMI scenarios. We aim to extract the important but possibly implicit HMI concepts (as named entities) from the textual annotations (provided by human annotators) through a novel human language and sensor data co-learning design.

    To this end, we propose CG-HMI, a novel Cross-modality Graph fusion approach for extracting important human–mobility interaction concepts by co-learning from textual annotations as well as visual and behavioral sensor data. To fuse both unspoken and spoken languages, we have designed a unified representation called the human–mobility interaction graph (HMIG) for each modality related to the HMI scenes, i.e., textual annotations, visual video frames, and behavioral sensor time-series (e.g., from on-board or smartphone inertial measurement units). The nodes of the HMIG in these modalities correspond to the textual words (tokenized for ease of processing) related to HMI concepts, the detected traffic participant/environment categories, and the vehicle maneuver behavior types determined from the behavioral sensor time-series. To extract the inter- and intra-modality semantic correspondences and interactions in the HMIG, we have designed a novel graph interaction fusion approach with differentiable pooling-based graph attention. The resulting graph embeddings are then processed to identify and retrieve the HMI concepts within the annotations, which can benefit downstream human-computer interaction and ubiquitous computing applications. We have developed and implemented CG-HMI as a system prototype and performed extensive studies on three real-world HMI datasets (two on car driving and the third on e-scooter riding). We have corroborated the excellent performance (on average 13.11% higher accuracy than the other baselines in terms of precision, recall, and F1 measure) and effectiveness of CG-HMI in recognizing and extracting the important HMI concepts through cross-modality learning. Our CG-HMI studies also provide real-world implications (e.g., regarding road safety and driving behaviors) about the interactions between drivers and other traffic participants.
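The following minimal sketch, using plain NumPy and randomly initialized node embeddings, illustrates the general idea of fusing nodes from the three modalities through attention over a shared graph. It is not the CG-HMI implementation: the node lists and dimensions are assumptions, and mean pooling stands in for the paper's differentiable pooling-based graph attention.

```python
# Minimal sketch of the cross-modality fusion idea (not the CG-HMI code):
# nodes from the three modalities share one graph, attention weights are
# computed between every pair of nodes, and the attended embeddings are
# pooled into a single graph representation.
import numpy as np

rng = np.random.default_rng(0)
D = 16  # embedding dimension (assumed)

# One node set per modality: textual tokens, detected visual categories,
# and maneuver types inferred from inertial sensor time-series.
text_nodes     = {"vehicle": rng.normal(size=D), "yields": rng.normal(size=D)}
visual_nodes   = {"pedestrian": rng.normal(size=D), "traffic_light": rng.normal(size=D)}
behavior_nodes = {"deceleration": rng.normal(size=D)}

names = list(text_nodes) + list(visual_nodes) + list(behavior_nodes)
X = np.stack(list(text_nodes.values())
             + list(visual_nodes.values())
             + list(behavior_nodes.values()))           # (N, D) node features

def graph_attention(X: np.ndarray) -> np.ndarray:
    """Single-head dot-product attention over a fully connected HMIG."""
    scores = X @ X.T / np.sqrt(X.shape[1])               # (N, N) similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)        # row-wise softmax
    return weights @ X                                    # attended node embeddings

fused = graph_attention(X)
graph_embedding = fused.mean(axis=0)  # simple pooling stand-in for
                                      # differentiable pooling
print(names, graph_embedding.shape)
```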

     
  3. Video cameras in smart cities can be used to provide data to improve pedestrian safety and traffic management. Video recordings inherently violate privacy, and technological solutions need to be found to preserve it. Smart city applications deployed on top of the COSMOS research testbed in New York City are envisioned to be privacy friendly. This contribution presents one approach to privacy preservation – a video anonymization pipeline implemented in the form of blurring of pedestrian faces and vehicle license plates. The pipeline utilizes customized deep-learning models based on YOLOv4 for detection of privacy-sensitive objects in street-level video recordings. To achieve real-time inference, the pipeline includes speed improvements via NVIDIA TensorRT optimization. When applied to the video dataset acquired at an intersection within the COSMOS testbed in New York City, the proposed method anonymizes visible faces and license plates with recall of up to 99% and inference speed faster than 100 frames per second. The results of a comprehensive evaluation study are presented. A selection of anonymized videos can be accessed via the COSMOS testbed portal.
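As an illustration of the blurring step in such a pipeline, the sketch below applies a Gaussian blur to detector-supplied bounding boxes with OpenCV. The file names and box coordinates are placeholders, and the YOLOv4/TensorRT detection stage is assumed to run upstream; this is not the COSMOS pipeline code.

```python
# Illustrative sketch of the anonymization step (not the COSMOS pipeline code):
# given bounding boxes from a face / license-plate detector, blur each region
# in place with OpenCV. The detector itself is assumed and omitted here.
import cv2

def blur_regions(frame, boxes, ksize: int = 51):
    """Blur each (x, y, w, h) box in the frame; ksize must be odd."""
    for (x, y, w, h) in boxes:
        roi = frame[y:y + h, x:x + w]
        if roi.size == 0:
            continue  # box falls outside the frame
        frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (ksize, ksize), 0)
    return frame

# Example usage with hypothetical detections from a YOLOv4-style model:
frame = cv2.imread("intersection_frame.jpg")          # placeholder file name
detections = [(120, 80, 40, 40), (400, 300, 90, 30)]  # placeholder boxes
if frame is not None:
    anonymized = blur_regions(frame, detections)
    cv2.imwrite("intersection_frame_anon.jpg", anonymized)
```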