

Title: A Method for Granular Traffic Data Imputation Based on PARATUCK2

Imputing missing data is a critical task in data-driven intelligent transportation systems. In recent decades there has been considerable investment in developing various types of sensors and smart systems, including stationary devices (e.g., loop detectors) and floating vehicles equipped with global positioning system (GPS) trackers, to collect large-scale traffic data. However, the collected data may not include observations from all road segments in a traffic network, for reasons that include sensor failure, transmission errors, and the fact that GPS-equipped vehicles do not always travel through every road segment. The first step toward developing real-time traffic monitoring and disruption prediction models is to estimate missing values through a systematic data imputation process. Many existing data imputation methods are based on matrix completion techniques that exploit the inherent spatiotemporal characteristics of traffic data. However, these methods may not fully capture the clustered structure of the data. This paper addresses this issue by developing a novel data imputation method using PARATUCK2 decomposition. The proposed method captures both spatial and temporal information in traffic data and constructs a low-dimensional, clustered representation of traffic patterns. The identified spatiotemporal clusters are used to recover network traffic profiles and estimate missing values. The proposed method is implemented using traffic data from the road network of Manhattan in New York City, and its performance is evaluated against two state-of-the-art benchmark methods. The outcomes indicate that the proposed method outperforms the existing state-of-the-art imputation methods in complex and large-scale traffic networks.
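
For orientation, the standard PARATUCK2 model writes each frontal slice of a three-way array as X_k = A · diag(DA[k]) · H · diag(DB[k]) · Bᵀ. The sketch below is only an illustration of that reconstruction formula and of filling missing entries from a reconstruction; it is not the paper's algorithm. The function names, the array shapes, and the toy data are all assumptions introduced here, and the factors are drawn at random rather than fitted to observed data (the fitting step is omitted).

```python
import numpy as np

def paratuck2_reconstruct(A, H, B, DA, DB):
    """Rebuild every slice X_k = A @ diag(DA[k]) @ H @ diag(DB[k]) @ B.T.

    Shapes (hypothetical): A (I, P), H (P, Q), B (J, Q), DA (K, P), DB (K, Q).
    Returns an array of shape (K, I, J).
    """
    return np.einsum("ip,kp,pq,kq,jq->kij", A, DA, H, DB, B)

def impute(X_obs, mask, X_hat):
    """Keep observed entries (mask is True) and fill the rest from X_hat."""
    return np.where(mask, X_obs, X_hat)

# Toy data: random factors stand in for factors that would normally be fitted.
rng = np.random.default_rng(0)
I, J, K, P, Q = 6, 5, 4, 2, 3
A, H, B = rng.random((I, P)), rng.random((P, Q)), rng.random((J, Q))
DA, DB = rng.random((K, P)), rng.random((K, Q))

X_full = paratuck2_reconstruct(A, H, B, DA, DB)
mask = rng.random(X_full.shape) > 0.3          # True where a value is observed
X_obs = np.where(mask, X_full, np.nan)         # simulate missing observations
X_filled = impute(X_obs, mask, X_full)         # fill gaps from the reconstruction
```

In a real setting the factors would be estimated from the observed entries only, and the quality of the imputation would depend on that fit; here the reconstruction is exact by construction, so the example only demonstrates the algebra.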

 
more » « less
Award ID(s):
2027024
NSF-PAR ID:
10378126
Publisher / Repository:
SAGE Publications
Date Published:
Journal Name:
Transportation Research Record: Journal of the Transportation Research Board
Volume:
2676
Issue:
10
ISSN:
0361-1981
Page Range / eLocation ID:
p. 220-230
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. As vehicles become increasingly connected and potentially autonomous, they are being equipped with rich sensing and communication devices. Various vehicular services based on shared real-time sensor data from a fleet of vehicles have been proposed to improve urban efficiency, e.g., HD-live maps and traffic accident recovery. However, due to the high cost of data uploading (e.g., monthly fees for a cellular network), it would be impractical to have all well-equipped vehicles upload real-time sensor data constantly. To better utilize these limited uploading resources and achieve optimal road segment sensing coverage, we present a real-time sensing task scheduling framework, i.e., RISC, for Resource-Constraint modeling for urban sensing, which schedules the sensing tasks of sensor-equipped commercial vehicles based on the predictability of the vehicles' mobility patterns. In particular, we utilize commercial vehicles, including taxicabs, buses, and logistics trucks, as mobile sensors to sense urban phenomena, e.g., traffic, using the equipped vehicular sensors, e.g., dash-cam, lidar, and automotive radar. We implement RISC in the Chinese city of Shenzhen with one month of real-world data from (i) a taxi fleet with 14 thousand vehicles; (ii) a bus fleet with 13 thousand vehicles; (iii) a truck fleet with 4 thousand vehicles. Further, we design an application, i.e., tracking suspect vehicles (e.g., hit-and-run vehicles), to evaluate the performance of RISC on the urban sensing aspect based on data from a regular vehicle (i.e., personal car) fleet with 11 thousand vehicles. The evaluation results show that, compared to the state-of-the-art solutions, we improved sensing coverage (i.e., the number of road segments covered by sensing vehicles) by 10% on average.
  2. The existence of spatiotemporal correlations in traffic behavior on links in a transportation network is potentially very useful. However, traffic metrics are often strongly correlated simply because of natural variations in travel demand patterns, and these temporal trends might obscure more meaningful relationships caused by the physics of traffic. To overcome this challenge, the present paper proposes a non-parametric, moving average detrending method that can be used to remove these background trends, even during non-stationary periods in which traffic states are changing with time. Cross-correlations performed on the detrended data are then used to identify more meaningful trends. The proposed method can also incorporate temporal lags in correlations between individual links, which accounts for the time it takes for information to travel between them. Links that exhibit strong correlations after detrending can then be grouped, using graph theory methods, into communities that behave together, and this community structure can be leveraged to improve prediction of link performance when information is missing. The proposed methodology is applied to a case study network using real-time link travel speeds obtained from probe vehicles. The results reveal that the 40 links in the network can be grouped into between 8 and 12 communities, depending on the day of the week. This suggests that only a handful of links may need to be monitored to estimate travel speeds across the entire network. Furthermore, the significant overlap in the community structure across these days reveals that the network structure plays a large role in spatiotemporal correlations in link travel speeds in a network.
  3. Network macroscopic fundamental diagrams (MFDs) have recently been shown to exist in real-world urban traffic networks. The existence of an MFD facilitates the modeling of urban traffic network dynamics at a regional level, which can be used to identify and refine large-scale network-wide control strategies. To be useful, MFD-based modeling frameworks require an estimate of the functional form of a network’s MFD. Analytical methods have been proposed to estimate a network’s MFD by abstracting the network as a single ring-road or corridor and modeling the flow–density relationship on that simplified element. However, these existing methods cannot account for the impact of turning traffic, as only a single corridor is considered. This paper proposes a method to estimate a network’s MFD when vehicles are allowed to turn into or out of a corridor. A two-ring abstraction is first used to analyze how turning will affect vehicle travel in a more general network, and then the model is further approximated using a single ring-road or corridor. This approximation is useful as it facilitates the application of existing variational theory-based methods (the stochastic method of cuts) to estimate the flow–density relationship on the corridor, while accounting for the stochastic nature of turning. Results of the approximation compared with a more realistic simulation that includes features that cannot be captured using variational theory, such as internal origins and destinations, suggest that this approximation works to estimate a network’s MFD when turning traffic is present.
  4. Real-time traffic data at intersections is important for the development of adaptive traffic light control systems. Sensors based on infrared radiation or GPS cannot provide detailed traffic information. Compared with these sensors, surveillance cameras have the potential to provide real scenes for traffic analysis. In this research, a You Only Look Once (YOLO)-based algorithm is employed to detect and track vehicles in traffic videos, and a predefined road mask is used to determine traffic flow and turning events on different roads. A Kalman filter is used to estimate and predict vehicle speed and location under background occlusion. The results show that the proposed algorithm can identify traffic flow and turning events at a root mean square error (RMSE) of 10, and that a Kalman filter with an intersection-over-union (IoU)-based tracker performs well under background occlusion. The proposed algorithm can also detect and track vehicles under different optical conditions, although bad weather and night-time influence the detecting and tracking process in areas far from traffic cameras. The traffic flow extracted from traffic videos contains road information, so it can not only help with single-intersection control but also provide information for a road network. The temporal characteristics of the observed traffic flow make it possible to predict future traffic flow from detected traffic flow, which will make traffic light control more efficient.
  5. Human mobility data may lead to privacy concerns because a resident can be re-identified from these data by malicious attacks even with anonymized user IDs. For an urban service collecting mobility data, an efficient privacy risk assessment is essential for the privacy protection of its users. Existing methods use prediction models to enable efficient privacy risk assessments, so that service operators can quickly adjust the quality of sensing data to lower privacy risk. However, most of these prediction models require massive training data, which has to be collected and stored first. Such large-scale, long-term training data collection contradicts the purpose of privacy risk prediction for new urban services, which is to ensure that the quality of high-risk human mobility data is adjusted to low privacy risk within a short time. To solve this problem, we present a privacy risk prediction model based on transfer learning, i.e., TransRisk, to predict the privacy risk for a new target urban service through (1) small-scale short-term data of its own, and (2) the knowledge learned from data from other existing urban services. We envision the application of TransRisk to a traffic camera surveillance system and evaluate it with real-world mobility datasets already collected in a Chinese city, Shenzhen, including four source datasets, i.e., (i) a call detail record dataset (CDR) with 1.2 million users; (ii) a cellphone connection dataset (CONN) with 1.2 million users; (iii) a vehicular GPS dataset (Vehicles) with 10 thousand vehicles; (iv) an electronic toll collection transaction dataset (ETC) with 156 thousand users, and a target dataset, i.e., a camera dataset (Camera) with 248 cameras. The results show that our model outperforms the state-of-the-art methods in terms of RMSE and MAE. Our work also provides valuable insights and implications on mobility data privacy risk assessment for both current and future large-scale services.